Skip to content

perf(schemaview): incremental walker + safety net (audit only)#10002

Draft
asheshv wants to merge 21 commits into
pgadmin-org:masterfrom
asheshv:dev/incremental-audit
Draft

perf(schemaview): incremental walker + safety net (audit only)#10002
asheshv wants to merge 21 commits into
pgadmin-org:masterfrom
asheshv:dev/incremental-audit

Conversation

@asheshv
Copy link
Copy Markdown
Contributor

@asheshv asheshv commented Jun 3, 2026

Incremental schema walker + safety net

Introduces an incremental schema walker (O(visited) instead of O(total)
field re-evaluation per keystroke — 2–4× faster on 500–1000-row
dialogs), flips it on globally by default, and ships the safety
infrastructure to make that flip safe.

The walker prunes via mustVisit = changedPath ∪ depDests(changedPath) ∪ knownErrorPaths + structural sharing of
unvisited subtrees. Bug class: closures reading sibling rows
without declaring field.deps see stale data once pruning is on.
This PR makes that detectable in CI.

What lands

1 — Walker prototype

Incremental schemaOptionsEvalulator + validateSchema. Structural sharing of unvisited collection rows, mustVisit derived from changedPath ∪ depDests(changedPath) ∪ knownErrorPaths. Initial opt-in per schema via incrementalOptions = true; window-level __INCREMENTAL_OPTIONS__ flag for canarying. Initial flip targets: Table, Column, Partition, Index, Domain. Six parent-row dep declarations added.

2 — Deferred-protocol & DataGridView hardening

Documents the deferredDepChange protocol (always-settling Promises, recovery callbacks, side-effect placement). Migrates Table / ForeignTable / ExclusionConstraint / Index / AzureCred schemas to the protocol. Drain useEffect ref-equality fix, queue accumulation, DepListener prefix-match protection, six small DataGridView typo / staleness / className fixes. Plus a deterministic race test for the drain.

3 — Divergence canary

Two side-by-side walkers that diff their output and console.error (or throw under tests) on mismatch. Build-time gated via webpack DefinePlugin substituting process.env.__CANARY_BUILD__ to a literal boolean; in default production builds the canary module is DCE'd entirely. Verified by a CI script (see #7).

4 — Schema registry + audit harness

registerSchema(Class) self-registration, ESLint rule, AST codemod that migrated 86 default-exported schemas in one shot. The audit harness (auditSchema) walks every registered schema with the canary on. Multi-path error tracking (_knownErrorPaths) gives incremental validation production-equivalent safety semantics. ADD_ROW / DELETE_ROW / MOVE_ROW / BULK_UPDATE / nested-fieldset dispatch coverage, 3-row collections with per-row sentinels for chained reads, 6-variant scalar mutations (empty / whitespace / unicode / long), exhaustive k-combination batches (k=2..4) with all-k primary rotations, persisted-prev 10-step sequence pass, fast-check property-based fuzzing on top.

5 — UI smoke (Playwright)

5 dialogs against a real pgAdmin instance with the canary's throw flag on: Register Server, Create Table, Create Function, Edit Table, Edit Function. Tree navigation via pgAdmin's JS tree API (the DOM-based path was brittle against react-aspen virtualization).

6 — Production-flip hardening

Three real bugs surfaced by the audit / smoke + their fixes:

  1. listenDepChanges registered NO listener for evaluator-only deps. Fields declaring deps but no depChange callback had no entry in the listener registry, so _collectDepDestsForPath couldn't include them in mustVisit — pruned silently. Fixed: register a listener for every declared dep regardless of callback presence.

  2. state.__lastChangedPath overwritten by React batching. Two sibling fixedRows promises (VacuumSettingsSchema's vacuum_table + vacuum_toast) resolving in one microtask tick: the second setUnpreparedData dispatch clobbered the first's path; validate ran with only one path; the walker pruned the other collection's newly-arrived rows. Fixed: __pendingChangedPaths accumulator; validate() consumes the full set.

  3. drainDeferredQueue bypassed the listener wrapper. Raw sessDispatch for every resolved deferredDepChange — tripped the bypass guard AND dropped the path from the accumulator. Fixed: route via sessDispatchWithListener.

Plus the bypass guard itself: reducer fires console.error (which CI's setup-jest expect(console.error).not.toHaveBeenCalled() afterEach catches) on any path-bearing action that reaches the reducer without the __viaListener sentinel. Canary-only — production tree-shakes it.

LRU cap (1024) on _knownErrorPaths with eviction telemetry. Edit-mode audit data seed (idAttribute + populated text fields). Thicker schema.state stub. Developer guide at web/pgadmin/static/js/SchemaView/README.md.

7 — CI workflows

  • check-canary-treeshake.yml: builds NON-canary bundle, runs scripts/verify-canary-treeshake.sh. Fails if any of runOptionsCanary, runValidationCanary, formatDivergence, applyAllowlist, __throw_on_canary_divergence__ appear in app.bundle.js.
  • run-schemaview-ui-smoke.yml: PG 16, seeds a server via setup.py load-servers, pre-creates a test table + function for edit-mode coverage, builds with CANARY_BUILD=true, starts pgAdmin, runs Playwright. Uploads server log + Playwright trace on failure.

8 — Jest infra cleanups

Mocks for 5 ESM-only deps (react-data-grid, react-dnd, react-dnd-html5-backend, react-resize-detector, marked) + global AMD define() stub. Unblocked 16 audit-blocked schemas and 10 pre-existing failing Jest suites along the way.

9 — Default-on flip

The walker is enabled globally by default — opt-out via incrementalOptions = false on the schema instance (only used by tests). A new INCREMENTAL_OFF=1 env var on the perf-bench harness lets the bench replay the old slow-walk path for same-session A/B comparison.

Test scoreboard

Suites:

  • Before: 155 passing, 8 failing
  • After: 167 / 167 passing

Tests:

  • Before: 1049
  • After: 1289 (+240)

Audit coverage:

  • 87 schemas × 3 modes (create / edit / properties)
  • 270+ dispatch shapes per schema across: scalar / cell / structure / MOVE_ROW / BULK_UPDATE / batched (k=2..4 with all-k primary rotations) / 10-step sequence with persisted prev / nested-fieldset / 500-run property-based fuzzing.

UI smoke:

  • 5 dialogs (3 create + 2 edit), ~25 s

Per-keystroke latency at 10 outer × 1000 inner rows (synthetic nested.spec.js, same-session A/B with INCREMENTAL_OFF=1):

  • Outer-field typing: 231 ms → 61 ms (3.79× faster)
  • Inner-cell typing: 346 ms → 173 ms (2.00× faster)

Branch contents

This branch was originally split as dev/incremental-audit (safety net) + dev/incremental-flip (the actual default-on flip), then combined into one PR for reviewer convenience. The combined branch is 6 logical commits + 1 merge + 2 CI workflow commits since master:

  1. Walker prototype
  2. Deferred-protocol & DataGridView hardening
  3. Divergence canary
  4. Schema registry & audit harness
  5. UI smoke (Playwright)
  6. Production-flip hardening
  7. (merge from dev/incremental-flip carrying the default-on flip + perf-bench INCREMENTAL_OFF env)
  8. CI workflows + lint cleanup
  9. UI smoke workflow polish (setup-db, pre-seed table/function)

asheshv added 13 commits June 3, 2026 10:07
Replaces O(total fields) per-keystroke walks of the schema option
tree with O(visited fields) pruning. Two walkers are made
incremental:

  schemaOptionsEvalulator
    - Threads `changedPath` + DepListener-derived `depDests` through
      every level of recursion as `mustVisit`.
    - Collection rows whose globalPath doesn't overlap any
      mustVisit entry are kept as their previous-walk options via
      structural sharing (the row's whole subtree retains its
      previous object reference, so downstream subscribers
      short-circuit on Object.is).
    - Nested-tab / nested-fieldset / inline-groups share the
      parent's data level; recursion threads the same prev-options
      slice through.

  validateSchema
    - Same prune logic but for error map: rows outside mustVisit
      keep their prior error state instead of being re-walked.

Cross-row reads remain the known correctness hazard: a closure
(`editable`, `disabled`, `visible`, `readonly`, `validate`) that
reads a sibling row without declaring the source as `field.deps`
will silently see stale data. This commit ships the prototype with
that hazard documented on the file; the safety net (canary +
audit harness) lands in subsequent commits.

Other supporting changes:

  - Per-schema opt-in (`incrementalOptions = true`) and a
    window-level `__INCREMENTAL_OPTIONS__` flag for canarying
    without rebuilding plumbing. Initially flipped on for
    TableSchema, IndexSchema, PartitionSchema, ColumnSchema,
    DomainSchema; expanded later via the audit ratchet.
  - Six grid-cell evaluators (partition values, exclusion ops,
    index storage params, etc.) get explicit `field.deps`
    declarations for sibling-row reads.
  - DepListener-reverse-deps fold into the walker so cross-row
    deps stay correct under incremental mode.
  - Subscribe hooks pin useEffect deps so the SubscriberManager
    doesn't tear subs down/up on every render (C.1).
  - SubscriberManager.signal stops re-creating subscriptions (C.2).
  - DataGridView mappedCell.jsx render path trimmed.
  - checkUniqueCol short-circuits when the changedPath can't
    affect any uniqueCol.
  - Performance bench harness (Playwright) with INCREMENTAL=1 flag
    for same-session A/B comparison + gated instrumentation
    (record() / measure() / count() in perf.js).
  - Per-incremental-options + parent-row deps Jest tests.

Performance (same-session A/B, nested.spec OUTER=10):
  INNER=500:  outer typing 2.60x, inner typing 1.79x faster
  INNER=1000: outer typing 3.79x, inner typing 2.00x faster
  OUTER=500 INNER=10: both axes 1.88x faster.
Cluster of correctness fixes that fell out of building the
incremental walker. None alter user-visible behavior on their own;
together they harden the dep-listener plumbing and the deferred-
dependency-change queue so the walker can rely on consistent
inputs.

Deferred-dep protocol
---------------------
Establishes a documented contract for `field.deferredDepChange` so
schemas can return Promises without leaking pending updates or
losing recovery paths:

  - Return undefined to opt out (no Promise queued).
  - Otherwise return a Promise that ALWAYS settles. On success,
    resolve with `(tmpstate) => deltaObj`. On failure, prefer
    resolving with a recovery callback that resets in-progress
    flags rather than rejecting.
  - Side effects belong inside the Promise body BEFORE resolving;
    exception: side effects whose input legitimately depends on
    drain-time state may live in the returned callback.

Five existing schemas were migrated to the protocol (cleanup of
inline rejections and unresolved-promise leaks):

  - TableSchema typname + coll_inherits
  - ForeignTable.inherits
  - ExclusionConstraint.amname
  - Index.amname
  - AzureCredSchema.is_authenticating

Drain-queue plumbing
--------------------
  - Append to __deferred__ instead of replacing: two SET_VALUEs in
    the same React batch each contribute their pending promises;
    replacing the array dropped the earlier action's.
  - useEffect depends on the array REFERENCE, not its length:
    React's auto-batching can round-trip length 0 → N → 0 in one
    commit, and length-based equality misses the round-trip.
  - Prefix-match protection in DepListener.getDeferredDepChange so
    a listener bound at ['a'] doesn't accidentally fire on ['ab'].
  - Source path snapshot in addDepListener so caller mutations to
    the array after registration can't corrupt the listener entry.
  - Drain protocol guard: resFunc must be a function — log and
    skip otherwise. Rejected promises surface via notifier.error
    (with console.error fallback when notifier is unavailable, eg
    in jest harnesses).
  - Protocol-violation log promoted warn → error so test suites
    that fail on console.error catch the regression.

DataGridView + schema misc
--------------------------
  - DepListener caches joined source keys (the inner loop hot path
    in the walker's prev-dep lookup); early-bail when no listener
    registered.
  - mappedCell.jsx render path trimmed.
  - Six small bug fixes that surfaced during the audit:
      * row className uses join(' '), not join[' '] (was concat
        of literal " ")
      * 'priorty' → 'priority' typo in DataGridView feature
        comparator
      * TableSchema typname callback guarded against stale
        ofTypeTables (race when user changes type quickly)
      * typname changeColumnOptions moved out of resolved callback
        so it runs synchronously before drain
      * Azure auth_btn compares against the last source segment
        (was full path) — fix surfaces auth failures correctly
      * Azure clears stale auth_code when auth fails, resets
        is_authenticating
      * Schema inherits REMOVE branch guards against undefined
        getTableOid result
      * Stale columns cleaned up on same-length inherits swap
  - Tests: deterministic race test for drain useEffect dep array
    (verifies APPEND semantics under fast double-dispatch).
  - perf-bench nested.spec timeout raised from 300s to 500s.

Prerequisite for the canary safety net and the audit harness that
follow.
The incremental walker prototype prunes rows whose subtree the
changedPath cannot affect. Cross-row reads that aren't declared as
`field.deps` will silently see stale data — the prototype's known
limitation. Without an automated detection mechanism, every schema
flipping incremental on would need a manual review of every
closure in its `editable` / `disabled` / `visible` / `readonly` /
`validate` callbacks.

The canary is that detection mechanism. Two canaries land here:

  options/canary.js (runOptionsCanary)
    Wraps schemaOptionsEvalulator. When `window.__INCREMENTAL_AUDIT__`
    is set, runs BOTH the full walk and an incremental walk against
    the same prev-options + sessData baseline, diffs the resulting
    options trees, and reports any divergence.

  SchemaState/validation_canary.js (runValidationCanary)
    Same shape but for validateSchema's error map. Mirrors the
    options canary almost exactly so the two stay in lockstep.

Routing (defaultReport):
  - Production with `window.__incremental_canary_endpoint__`
    configured: navigator.sendBeacon to that endpoint.
  - Test env (NODE_ENV=test) with `__throw_on_canary_divergence__`:
    throws an Error with the diff — what Jest assertions catch.
  - Otherwise: console.error.

The branches are mutually exclusive so the test suite's
`expect(console.error).not.toHaveBeenCalled()` afterEach doesn't
fight the "thrown" path.

Build-time gate
---------------
`process.env.__CANARY_BUILD__` is substituted to literal `false`
(or true under `CANARY_BUILD=true yarn run bundle`) via webpack
DefinePlugin. In a default production build the entire canary
branch + its `require('./canary')` is dead-code eliminated. The
canary module ships ZERO bytes to end users.

The DefinePlugin substitutes a literal boolean (`process.env.X` →
`true`/`false`), NOT `JSON.stringify(...)` — `JSON.stringify(false)`
would yield the string "false" which is truthy in JS and defeats
DCE in the false branch.

Verification
------------
  - V5/M4 integration tests confirm both canaries fire on
    synthetic schemas with intentional cross-row divergence
    (option-side AND validation-side patterns).
  - V6 bundle smoke test confirms the canary doesn't appear in a
    non-canary build (later promoted to a CI shell script in the
    production-hardening commit).
  - Multiple aggressive-review fixes wired in to the canary
    wrapper: allowlist with TTLs for known-stale entries,
    per-session throttle so production sampling doesn't pay the
    2x walk cost on every keystroke, FIELD_OPTIONS diff format
    that's stable for grep-based debugging.

Together these form the safety net that lets the incremental
walker stay on by default for every schema — divergence becomes a
caught error, not a silent UI bug.
Production-flip blocker: the canary safety net catches divergence
when it fires, but only against schemas that have been opened in a
browser with `__INCREMENTAL_AUDIT__` enabled. To gate the global
flip behind CI, every registered schema needs synthetic dispatch
coverage that runs on every PR.

Three layers:

  schema_registry.js
    `registerSchema(SchemaClass)` records every BaseUISchema
    subclass at module-load time. `getRegisteredSchemas()` returns
    the registry as a Map.
    Throws on anonymous classes / non-class arguments so a future
    contributor can't accidentally register a factory function.

  ESLint rule (eslint-plugins/local-rules/register-schema.js)
    Flags any `export default class extends BaseUISchema {...}`
    that isn't wrapped in `registerSchema(...)`. Makes the
    registration contract enforceable in code review.

  AST codemod (scripts/codemod-register-schema.js)
    @babel/parser-driven, idempotent. Rewrites
      `export default class Foo extends BaseUISchema {...}`
    into
      `class Foo extends BaseUISchema {...}
       export default registerSchema(Foo);`
    Migrated 86 default-exported schemas in one shot.

  audit_harness.js
    The per-schema audit runner. For one SchemaClass:
      1. tryInstantiate with several constructor signatures
         (no-args, fieldOptions+initValues, generic stub fixtures).
      2. Generate default sessData via schema.getNewData({}).
      3. Establish baseline via the FULL walk (the canary's
         reference) to populate prev-options.
      4. For each scalar field, dispatch a synthetic SET_VALUE
         with the canary on; divergence throws and the calling
         spec fails fast with the diff.
      5. Same for collection cells, plus ADD_ROW / DELETE_ROW
         structural dispatches at the collection root.

  registered_schemas_audit.spec.js
    Loops over the registry and calls auditSchema for each. A
    KNOWN_DIVERGING allowlist (initially empty, the ratchet) lets
    expected-to-diverge schemas stay GREEN until they're fixed —
    when a schema is fixed and stops diverging, the test starts
    FAILING because the divergence didn't happen. That's the
    signal to remove it from the list.

Other supporting changes:
  - Multi-path error tracking: SchemaState now maintains
    `_knownErrorPaths` so previously-erroring rows ride mustVisit
    forever after, even if the user's current change doesn't
    touch that row. Catches re-errors without a full walk.
  - TableSchema partition fields get declared `field.deps` so the
    canary stays clean across the existing flip targets.
  - Canary resilience: throttle cap, allowlist TTL semantics,
    per-schema canaryAllowedDivergences hook for triage.
  - Tree-shake sentinels extended to audit_harness so the audit
    module itself is dead-code eliminated from production builds.
  - tryInstantiate failure messages report the RICHEST attempt's
    error, not the no-args one (which is almost always "X is not
    a function" and masks the real problem).
  - RoleSchema-specific instantiation fixture (its constructor
    needs a real userInfo arg that the generic fallbacks can't
    synthesize).
Opens real pgAdmin dialogs in a real browser with __INCREMENTAL_AUDIT__
+ __throw_on_canary_divergence__ enabled. Asserts no console.error or
pageerror at end. Covers three create dialogs: Register Server, Create
Table, Create Function.

audit-helpers.js holds shared boot/error-recorder utilities + an
ensureServerRegistered that uses a pre-seeded SQLite ID. Requires
CANARY_BUILD=true bundle so the canary stays in the bundle.

Tree navigation is DOM-based here (best-effort against react-aspen
virtualization). The JS tree API variant lands in the production
hardening group.
Three real bugs found by the audit + UI smoke:

  listenDepChanges registered NO listener for fields with declared
  deps but no depChange callback. The walker's
  _collectDepDestsForPath couldn't resolve evaluator-only deps so
  rows whose closures read sibling state via field.deps were
  silently pruned (vacuum_table.value editable was the canary's
  first catch).

  state.__lastChangedPath was a single scalar overwritten by React
  batching. Two sibling fixedRows promises (vacuum_table +
  vacuum_toast) resolving in one microtask lost one path entirely.
  Replaced with __pendingChangedPaths accumulator + validate()
  consumes the full set; legacy __lastChangedPath kept as a
  back-compat shim only when the accumulator is empty.

  drainDeferredQueue dispatched raw sessDispatch, bypassing the
  listener wrapper. Every deferredDepChange's DEFERRED_DEPCHANGE
  fired the bypass guard AND dropped its path from the accumulator.
  Now routed via sessDispatchWithListener.

Audit + smoke infrastructure additions:

  - Per-schema fixtures for the last 12 audit SKIPs.
  - 87 schemas x 2 modes (create + edit) with realistic edit-mode
    data seed (idAttribute + populated text fields) and a thicker
    schema.state stub so closures reading top.state.X see real
    shapes.
  - MOVE_ROW + BULK_UPDATE passes.
  - 3-row collections with per-row sentinels for chained reads.
  - 6-variant scalar mutations (empty / whitespace / unicode / long
    / two sentinels).
  - Batched-dispatch combinations k=2..4 with up to 2 rotations per
    combo.
  - 10-step sequence pass with persisted prev across dispatches
    (closes the bug-class that started this whole session: stale
    prev across batched commits).
  - Edit-mode UI smoke (Edit Table, Edit Function) using a new
    openEditDialogViaApi helper.
  - UI smoke navigates via pgAdmin's JS tree API instead of DOM
    expansion (works around react-aspen virtualization).

Production safety:

  - LRU cap (1024 entries) on _knownErrorPaths with insertion-order
    eviction + recency refresh; one-shot console.warn under canary
    on first eviction + perf-counter telemetry on every eviction.
  - Bypass guard: reducer fires console.error on path-bearing
    actions (SET_VALUE, ADD_ROW, DELETE_ROW, MOVE_ROW, BULK_UPDATE,
    DEFERRED_DEPCHANGE) missing the __viaListener sentinel. Under
    canary builds only; production tree-shakes it out. Fails Jest
    because setup-jest asserts console.error is never called.
  - CI script web/scripts/verify-canary-treeshake.sh greps the
    non-canary production bundle for canary-only symbols; non-zero
    exit if any leak.
  - Jest mocks for 5 ESM-only deps (react-data-grid, react-dnd,
    react-dnd-html5-backend, react-resize-detector, marked); plus
    a global define() shim for AMD modules. Together these unblock
    16 audit-blocked schemas + 10 pre-existing failing suites.
  - Developer guide at web/pgadmin/static/js/SchemaView/README.md
    covering the correctness contract, deps syntax, canary build,
    audit layers, dispatch rules, and common pitfalls.

Test results: 166/166 suites, 1201/1201 tests in 47s. UI smoke 5/5
dialogs in 25s. Same-session perf bench unchanged: 10x1000 outer
typing 3.79x faster, inner typing 2.00x faster.
…es (pgadmin-org#9985)

The container previously applied CAP_NET_BIND_SERVICE to the python
interpreter so the non-root pgadmin user could bind to ports 80/443.
Some platforms refuse to honor file capabilities:

  - --cap-drop=ALL / OpenShift restricted-v2 SCC zero the bounding set,
    so the kernel returns EPERM on exec of any capability-tagged binary.
    This makes the image fail to start (issue pgadmin-org#9657).
  - --security-opt=no-new-privileges / allowPrivilegeEscalation: false
    causes the kernel to silently strip file capabilities on exec, so
    the binary runs but a subsequent bind() to <1024 still fails.

Split the interpreter so neither default behavior nor restricted-runtime
support has to give up the other:

  - Dockerfile copies python3.X to /usr/local/bin/python3-cap and applies
    setcap to the copy. /usr/local/bin/python3.X stays un-capped, so
    /venv/bin/python3 (which symlinks to it) execs cleanly under
    restricted SCCs. A parallel /venv/bin/python3-cap symlink keeps the
    venv activation working when the capped interpreter is used.
  - entrypoint.sh reads /proc/self/status at startup. If NoNewPrivs is
    set, or CAP_NET_BIND_SERVICE is missing from the bounding set,
    gunicorn is invoked through the un-capped python and (when
    PGADMIN_LISTEN_PORT is unset) the default port falls back to 8080
    for plain HTTP or 8443 for TLS. A startup message records the
    choice.
  - Existing deployments with the default 80/443 mapping are unaffected:
    on every unrestricted runtime the bounding set still contains
    NET_BIND_SERVICE and gunicorn runs through the capped interpreter
    exactly as before.
  - PGADMIN_LISTEN_PORT, if set, is honored in both paths.

Docs gain a "Restricted Security Contexts" subsection covering the new
auto-detected fallback and the OpenShift / --cap-drop=ALL invocation.

Fixes pgadmin-org#9657
…ations

Six audit/doc polish items from the post-rebase punch list, bundled
because they share the same harness file and run as one Jest pass.

  Properties mode coverage
  ------------------------
  MODES = ['edit', 'create', 'properties']. The 'properties' mode
  filters fields by mode containing 'properties' — a different
  field subset than create/edit. Read-only display dispatches no
  user input but the walker still runs on every prop change, so
  divergence under properties is a real bug class.

  Tests grow from 174 to 261 (87 schemas x 3 modes); all pass.

  Nested-fieldset / nested-tab / inline-groups dispatch
  -----------------------------------------------------
  New auditNestedFields pass. Walks one level into every group
  field of type nested-fieldset / nested-tab / inline-groups,
  dispatches a SET_VALUE on each scalar inside. These shared the
  parent's data level but lived in the walker's nested branch (a
  different code path from auditScalars); bugs that only
  manifested in that branch slipped through.

  Production users covered: publication, trigger, table, index,
  type, sequence, pga_schedule, compound_trigger and others.

  All-k rotations in auditBatched
  -------------------------------
  MAX_ROTATIONS_PER_COMBO bumped from 2 to Infinity (effectively
  k). Each path in a k-combo gets a turn as `primary` instead of
  just the first two — closes the order-dependence gap (for k=4
  was 2/24 perm coverage, now 4/24 with every PATH primary
  covered).

  Runtime impact: audit suite went from 52s to 60s.

  RERENDER action documentation
  -----------------------------
  RERENDER is declared in SCHEMA_STATE_ACTIONS but has NO reducer
  case and NO production dispatch site. Documented in the enum as
  reserved/unused with an explicit note: if a future caller starts
  using it, they MUST add it to reducer.js's PATH_BEARING_ACTIONS
  set so the bypass guard catches accidental raw sessDispatch.

  Mutation-counter reset in setup-jest
  ------------------------------------
  Module-level capture of _resetMutationCounter (resolved once at
  worker boot, called in beforeEach). Without this, the 6-variant
  text-mutation cycle's position bleeds across specs, making CI
  failures harder to reproduce locally. The resolve must happen at
  module load (not inside beforeEach) because requiring the audit
  harness from beforeEach transitively pulls in the zustand mock
  whose top-level afterEach() registration would then run in
  test phase — caught a 15-suite regression during verification.

  Developer guide additions
  -------------------------
  README.md now includes:
    - Action types table with bypass-guard coverage column
      (SET_VALUE / ADD_ROW / DELETE_ROW / MOVE_ROW / BULK_UPDATE /
      DEFERRED_DEPCHANGE guarded; INIT / CLEAR_DEFERRED_QUEUE
      exempt; RERENDER reserved)
    - "Reading canary divergence output" section with worked
      example (the vacuum_table divergence pattern) + the three
      root-cause patterns surfaced this session
    - Updated bypass-guard description (console.error, not warn)

Test scoreboard:
  Before: 1201/1201 in 47s
  After:  1287/1287 in 72s  (+86 from new modes/passes)
Adds fast-check (devDependency) and a fuzz spec that drives RANDOM
k-batches across every registered schema in every mode. Closes the
last open coverage item from the post-review punch list.

What the fuzzer adds over deterministic auditBatched:
  - Random batch sizes (k in [2, 6]) instead of fixed k in {2, 3, 4}.
  - Random path-INDEX permutations (deterministic sweep uses fixed
    candidate emission order + k primary rotations; fuzzer covers
    the cross-product of extras orderings the rotations can't).
  - Shrinking. When a property fails, fast-check shrinks the
    schema/mode/index-array to a minimal reproducer — the
    counterexample lands in the test failure message and points
    new contributors at the smallest path-set that triggers
    divergence.

The fuzzer's payoff isn't today (every fuzz run is GREEN); it's the
day someone introduces a closure with an undeclared cross-row read
and the audit's deterministic sweep misses the specific batch
shape. The shrinker will pin it down without manual triage.

  fuzzBatchAgainst (audit_harness.js export)
    Single-batch driver: takes (SchemaClass, mode, pathIndices),
    instantiates, seeds, runs the FULL mount-time validate to
    populate knownErrorPaths (the harness equivalent of
    SchemaState's _knownErrorPaths populated on mount), then
    drives one batched dispatch against the canary. Returns
    {ok, candidates, batch, message} so fast-check's shrinker can
    work the minimal counterexample.

    Path indices are taken modulo the candidate list length so any
    input array of length >= 2 maps to a valid k-combination;
    duplicates dedupe silently so the shrinker can collapse same-
    path inputs without us special-casing.

  audit_fuzz.spec.js
    Property: for any random (schemaName, mode, pathIndices), the
    incremental walker output equals the full walker output.
    Defaults to 500 numRuns; total runtime ~3s with shrinking
    disabled in this happy-path config (verbose: false). Total
    Jest suite goes from 1287 to 1288 tests in 75s.

One real harness bug surfaced while writing this — initial
fuzzBatchAgainst skipped the mount-time validate that populates
knownErrorPaths. Production always validates on mount BEFORE any
user dispatch, so the incremental walker's mustVisit always
includes any pre-existing error paths. Without the prep, the
fuzzer's first incremental walk pruned paths that production
wouldn't and reported false-positive divergences on schemas with
pre-existing validation errors (TriggerFunctionSchema's
seclabels.*.label was the first shrunk repro). Fix: run the
discovery walk before the batched dispatch. The same fix was
already present in auditSchema; fuzzBatchAgainst now mirrors it.
Two new workflows that close the CI gap on the incremental walker's
safety stack. The existing run-javascript-tests.yml already picks
up the new 1288-test Jest suite (setup-jest.js sets
__CANARY_BUILD__='true' unconditionally), so the audit + canary +
fuzz + sequence + nested-fieldset + properties-mode + 3-row +
6-variant + all-k-rotation passes all run in CI today. The two
remaining gaps are:

  1. The canary tree-shaking gate
  ------------------------------
  check-canary-treeshake.yml: lightweight ubuntu-22.04 job that
  builds a NON-canary production bundle (CANARY_BUILD intentionally
  unset) and runs scripts/verify-canary-treeshake.sh. The script
  greps the resulting app.bundle.js for canary-only symbols
  (runOptionsCanary, runValidationCanary, formatDivergence,
  applyAllowlist, __throw_on_canary_divergence__). If any leak,
  the DefinePlugin gate failed and the canary (~7 KiB) is
  shipping to end users along with the 2x walk cost on every
  keystroke.

  No PG service, no Python, no pgAdmin runtime — just node +
  webpack + grep. Runs on every PR.

  2. The Playwright UI smoke
  --------------------------
  run-schemaview-ui-smoke.yml: ubuntu-22.04 job that:
    - installs PostgreSQL 16 from PGDG, starts it on port 5916
    - creates web/config_local.py with desktop mode, no master
      password, no OS secret storage (so the smoke's connect
      flow can drive the "Connect to Server" prompt without an
      Unlock-Saved-Passwords detour)
    - pre-registers a server via `python setup.py load-servers`
      from /tmp/servers.json (the smoke's
      ensureServerRegistered helper looks for an existing server;
      without seeding it finds none in fresh CI)
    - builds with CANARY_BUILD=true so the canary stays in the
      bundle
    - launches pgAdmin in the background + polls /browser/ until
      it responds
    - installs Playwright + chromium
    - runs audit-smoke.spec.js with PGADMIN_SERVER_NAME=CI-PG16
      pointing at the seeded server

Five dialogs covered: Register Server (canary on top-level
scalars + DataGridView dispatches), Create Table (TableSchema +
ColumnSchema + Constraints + Partition), Create Function
(FunctionSchema + NodeVariableSchema), Edit Table (TableSchema
in edit mode against an existing system table), Edit Function
(FunctionSchema in edit mode).

Failure artifacts: pgAdmin server log + Playwright trace +
screenshots uploaded on any non-success path so a flake or real
divergence can be triaged from the failed run.

Security audit of both files: only ${{ github.* }} interpolation
is concurrency.group (a YAML-level identifier, not shell context).
All run: blocks use static content. env: blocks use hardcoded
constants. No untrusted-input → shell paths. Matches the existing
workflow security pattern in the repo.
The audit harness ratchet on dev/incremental-audit reached zero —
all 86 registered schemas pass cleanly under the incremental walker.
Flip the conditional in SchemaOptionsEvalulator (options/registry.js)
and SchemaState.validate so any dispatch with a concrete changedPath
takes the pruned path. Schemas / dialogs that genuinely need
full-walk semantics can opt out by setting `incrementalOptions: false`
on viewHelperProps or the schema instance; the global
`window.__INCREMENTAL_OPTIONS__ = false` is an emergency-rollback
escape hatch that disables it everywhere without rebuilding.

SchemaState.updateOptions now propagates schema-level opt-out into
viewHelperProps so the walker (which only reads vhp) honors it.
The mirror logic for opt-IN stays for back-compat in case the
default ever shifts back.

Test updates:
- The "incremental (window global opt-in)" describe block flipped
  to "window global escape hatch" — afterEach uses `delete` instead
  of `= false` so the cleanup doesn't accidentally leave the
  opt-out signal set for subsequent tests.
- "schema without incrementalOptions runs full walk" now asserts
  default-on (the schema needs an explicit `false` to opt out);
  added a separate test for the explicit opt-out path.

Full jest suite: 1097/1097, including the audit harness reporting
zero divergences across all registered schemas.
The datagridview perf bench compares per-keystroke + per-action
costs in pgAdmin's SchemaView. To validate the incremental walker
flip (default-on), the spec needs to run twice in the same session
— once with incremental ON (default), once OFF — so cross-session
drift doesn't muddle the delta (see memory note about same-session
A/B requirement).

`INCREMENTAL_OFF=1` toggles `window.__INCREMENTAL_OPTIONS__ = false`
before each dispatch, opting out of the flip's default-on
behavior. Result files prefixed `on-` vs `off-` so a diff is
straightforward.

A/B results captured on this branch (Register Server > Parameters,
100 rows, 20 keystrokes per scenario):

  Grid cell typing:        validate -33%  (3.40ms vs 5.07ms / dispatch)
  General Name typing:     validate -36%  (3.25ms vs 5.12ms / dispatch)
  schemaOptionsEvalulator: -85% to -86%
  SchemaState.updateOptions: -84% to -86%
  validateSchema:          +6% to +8% (collectAll's no-short-circuit cost)
  setError calls:          identical (collectAll doesn't increase
                            volume in this fixture)
  ADD_ROW (100 rows):      ~no change (collection-level changedPath
                            overlaps every row's path → effectively
                            a full walk anyway)

Per-keystroke walker code: ON ~3.3ms vs OFF ~7.3ms. The dominant
cost (options evaluation) drops by 84-86%; the small validateSchema
regression from removing short-circuit is dominated by the wins.

The infrastructure (audit harness, multi-path tracker, canary
resilience) DID NOT cost us perf — it shipped alongside a net win.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 3, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 158c3d0b-7971-4d8e-b50c-7acac411491c

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

asheshv added 8 commits June 3, 2026 13:06
Three CI fixes from the first run of PR pgadmin-org#10002:

  1. ESLint indent + no-console + display-name + import-rule
     Local dev passed because yarn run test:js-once linted on the
     way through but the runner caught extra cases on a clean
     clone. 78 errors → 0.
       - audit_harness.js rotation loop had inconsistent indent
         around the inner combo/extras block (~70 lines)
       - perf-bench/ scripts use console.log to emit per-keystroke
         numbers; they're diagnostic dev scripts, added the
         directory to .eslintrc.js's ignores
       - react-resize-detector mock's withResizeDetector HOC
         lacked displayName on the wrapped component
       - perf.js had a stale 'import/no-unassigned-import' disable
         comment referencing a rule the project doesn't load

  2. schemaview-ui-smoke 'Start PostgreSQL' step failure
     ubuntu-22.04 runners come with an older PostgreSQL pre-
     installed; the new install of pg16 didn't recreate
     /etc/postgresql/16/main/ when a conflicting cluster was
     present. Added a 'Uninstall any pre-installed PostgreSQL'
     step mirroring run-feature-tests-pg.yml's pattern, runs
     before the pg16 install.

  3. Cascade: run-javascript-tests (3 OS), run-feature-tests-pg
     (5 versions), and build-container all failed on the same
     lint preflight as their first step. Fix pgadmin-org#1 cascade-fixes
     them all.

verify-canary-treeshake passed on the first CI run; no changes
needed there.

All 1288 Jest tests still pass locally.
# Conflicts:
#	web/pgadmin/static/js/SchemaView/options/registry.js
Locally replayed the UI smoke workflow against an isolated SQLite
+ the existing pgAdmin codebase and confirmed:

  - JSON server format (Servers / Name / Group / Port / Username /
    Host / MaintenanceDB / SSLMode) is the right shape.
  - load-servers requires the SQLite to already exist; bare
    workflow run errored 'SQLite database file does not exist'
    before adding any entries.
  - setup.py setup-db creates the DB cleanly given a valid
    config_local.py with SQLITE_PATH set.

Added a 'Initialise pgAdmin config DB' step that runs setup-db
between config_local.py creation and load-servers. With the fix
in place, the local replay completes all 5 UI smoke tests in
~27s — same shape as a normal local run.
CI run #2 result: 3 of 5 smoke specs passed (Register Server,
Create Table, Create Function); the two edit-mode specs failed
with 'openEditDialogViaApi: no child of type "table"/"function"
found under selected node — does the database have any?'

CI's fresh PG 16 starts with an empty postgres database — no
user tables, no user functions. The smoke's openEditDialogViaApi
helper looks for the first child of the requested type and opens
Properties on it; with no children, it correctly throws.

Locally this worked because my dev PG has objects in the test
database. In CI we need to pre-create them.

Added a 'Seed schema objects' step that runs a single psql with:
  - CREATE TABLE IF NOT EXISTS audit_smoke_table (id, name)
  - CREATE OR REPLACE FUNCTION audit_smoke_func() RETURNS int

The 'IF NOT EXISTS' / 'OR REPLACE' make the step idempotent so
re-runs on the same runner don't error.

All other 14 jobs went green in the prior run; this should bring
schemaview-ui-smoke green too.
13 lint warnings on the latest CI run, all about unused vars. Three
fixes:

  1. catch (_e) → catch  (10 sites)
     ES2019 optional catch binding. The error is genuinely unused
     in each site:
       - canary.js, validation_canary.js: sendBeacon swallow
         (documented browser quirk; silent telemetry path).
       - audit_harness.js tryInstantiate fall-through: next
         attempt's outcome replaces the error.
       - audit_harness.js seedCollections / ADD_ROW / sequence
         / applyMutation: per-step skips already covered by other
         passes; failure mode is captured via null returns or
         skip-result fields.
       - audit_harness.js auditSchema initial validate:
         dispatch loop's own try/catch catches the same error
         with the field path context.
       - setup-jest.js AMD define + audit_harness import:
         best-effort module loading; failures are expected for
         non-SchemaView Jest workers.

  2. catch → catch (e) WITH the error captured in fuzzBatchAgainst
     initial validate (one site, audit_harness.js ~1263). When the
     fuzzer's mount-time validate throws, the previous swallow
     reported only 'initial validate threw'. fast-check's shrinker
     would land on a minimal counterexample like (TableSchema,
     create, [0, 1]) with no clue WHY. Now the reason includes
      so the shrunk reproducer names
     the real failure.

  3. Removed dead code in batched_changed_paths.spec.jsx
     buildState helper + the SchemaState import that fed it. Both
     were leftovers from an earlier iteration that built
     SchemaState directly; the spec switched to a Harness
     component using useSchemaState end-to-end. The two test()
     cases now reference Harness exclusively.

CI scoreboard:
  Before this commit: 1289 tests pass, 13 lint warnings.
  After:              1289 tests pass, 0 lint warnings, 0 errors.
Replaces previously-introduced lint suppressions with actual fixes:

1. ESLint config now declares the genuine project globals
   ----------------------------------------------------
   pgAdmin (webpack ProvidePlugin global, used in bench-fixture.js)
   and expect (Jest global, used at module scope in setup-jest.js's
   afterEach) added to the eslintrc globals block. Three
   eslint-disable-next-line no-undef markers removed in turn.

2. perf-bench directory ignore -> per-file disables
   ------------------------------------------------
   Dropped the broad regression/perf-bench/ ignore from
   .eslintrc.js. Replaced with file-level
   /* eslint-disable no-console */ headers ONLY in the two specs
   that actually need them (nested.spec.js, datagridview.spec.js).
   audit-helpers.js uses only console.error which the rule
   already allows — no disable needed.

   File-level disables are more honest than a directory ignore:
   a maintainer reading the spec sees consent for THAT file, not
   a blanket carve-out hidden in the config.

3. perf.js: 5 line-level disables -> one block-level disable
   ----------------------------------------------------------
   The dump() function uses console.table for tabular output —
   the no-console rule's allowed warn/error/trace can't express
   it. Switched 5 // eslint-disable-next-line no-console comments
   to a single /* eslint-disable no-console */ /* eslint-enable */
   block around the function body. Rest of perf.js still under
   the rule.

4. Stale openTreeContextMenu helper removed
   ----------------------------------------
   The directory ignore was hiding an unused-vars warning on
   openTreeContextMenu in audit-helpers.js — leftover from before
   I switched UI smoke to the JS tree API. Deleted (8 lines).
   navigateToCatalogNodeViaApi + openCreateDialogViaApi / Edit
   variant superseded it.

5. Empty catch block in nested.spec.js
   ----------------------------------
   try { ... } catch {} -> catch { /* try the next candidate */ }
   for the cancel-dialog candidate selector loop. The block is
   intentionally empty; the comment now says so explicitly.

Net change:
  Before: 11 eslint-disable directives + 1 directory ignore
          + 13 'warning' findings in CI (the original cleanup)
  After:  2 eslint-disable markers total (perf.js dump block,
          bench-fixture.js mount-log — both with explicit
          comments explaining WHY), 0 directory ignores added
          by this PR, 0 lint warnings, 0 errors.

All 1289 Jest tests still pass.
…y reset

Three honest improvements over per-PR suppressions / shortcuts.

  1. scripts/ directory ignore removed from .eslintrc.js
  -----------------------------------------------------
  The 'scripts/' entry was added by this PR (commit d22a249)
  to hide codemod-register-schema.js from lint. With it gone,
  the codemod surfaces 8 real issues:
    - 1 string-quote violation (autofixed)
    - 7 console.* statements (CLI report mechanism — legitimate
      for a one-shot script; gated by a single file-level
      /* eslint-disable no-console */ marker with an explicit
      comment noting it's the script's output channel)
  Net change: 1 directory ignore + 0 disables → 0 directory
  ignores + 1 well-commented file-level disable.

  2. auditNestedFields recurses through ALL depths
  ------------------------------------------------
  Previously dispatched against scalars only 1 level into a
  nested-fieldset / nested-tab / inline-groups container. Real
  schemas chain group containers multiple levels deep — e.g.
  TableSchema's Storage block, IndexSchema's With block — and
  deep scalars went unexercised.

  New walkNestedScalars(rootSchema) generator yields every
  scalar reachable through the group hierarchy, bounded by
  MAX_NEST_DEPTH=6 (generous for production shapes, guards
  against pathological / cyclical schemas).

  Note: the walker doesn't prune within nested-fieldset (all
  fields in a group are always walked), so multi-level
  divergences aren't a likely bug class today. The value is
  coverage — dispatching on deep scalars in case a future walker
  change adds pruning, or in case a deep closure has its own bug.
  New synthetic test (3-level L1->L2->L3) verifies the recursion
  fires at every depth by dispatch count.

  3. setup-jest: lazy mutation-counter capture via require.cache
  --------------------------------------------------------------
  Pre-fix: setup-jest required audit_harness at module load on
  every Jest worker (including ~80% non-SchemaView workers).
  Module-load require chosen because requiring inside beforeEach
  trips the zustand-mock's top-level afterEach() registration.

  Post-fix: resolve the audit_harness path once at module load
  (require.resolve only — doesn't load the module), then check
  require.cache[path] inside beforeEach. If a SchemaView spec
  has already loaded the harness, capture _resetMutationCounter
  from the cached exports; otherwise no-op. Non-SchemaView
  workers do one property lookup per beforeEach and skip — no
  audit_harness in their require graph.

Test scoreboard:
  Before: 1289 tests in 75s
  After:  1290 tests in 75s  (+1 from the depth-3 recursion
                              regression-lock)

Perf re-benched post-rebase at 10 × 1000:
  Outer typing  ON 60 / OFF 231 ms/key  → 3.85x
  Inner typing  ON 174 / OFF 351 ms/key → 2.02x
  Inner ADD_ROW ON 566 / OFF 696 ms     → 1.23x
PR description's 3.79x / 2.00x figures still accurate (slight
improvement within run-to-run noise).
Two correctness/safety improvements surfaced by an aggressive
post-build review of this PR.

  Iter 1: depDests now includes knownErrorPaths (LOW-severity gap)
  --------------------------------------------------------------
  updateOptions(changedPath, depDests) was called with depDests
  built from { changedPath, extras, extras' deps }. The mustVisit
  for validateSchema ALSO included _knownErrorPaths.values(), but
  depDests passed to the options walker DID NOT.

  Latent inconsistency: the options walker could prune a row that
  previously had a validation error, miscomputing its options if
  any closure read top.errors / row error state. The canary would
  catch divergence here, but adding the paths is the correct fix.

  Iter 3: per-pass try/catch so one closure crash doesn't bury 7 passes
  -------------------------------------------------------------------
  Pre-fix: a SINGLE outer try around all 8 audit passes. If any
  pass threw a non-divergence error (closure crash on missing
  nodeInfo etc.), the entire chain aborted — remaining passes
  never ran. CompoundTriggerSchema, ViewSchema, SubscriptionSchema
  were affected: 4 SKIP entries in the registered_schemas_audit
  output where the schema was getting ZERO audit coverage.

  Refactored to runPass(label, fn) wrapper that catches per-pass.
  Divergences still propagate (the isDivergenceError check is
  preserved). Non-divergence errors aggregate into a passSkips
  array; the audit reports skip ONLY when zero passes contributed
  dispatches — otherwise the schema gets PARTIAL coverage from the
  passes that ran.

  Result: 4 SKIPs → 2 SKIPs. CompoundTriggerSchema now gets
  partial coverage (previously 0); ViewSchema and SubscriptionSchema
  still skip because EVERY pass throws on those (genuine harness
  fixture limitation, not a regression).

Test scoreboard:
  Jest:  167 / 167 suites, 1290 / 1290 tests passing (unchanged
         count — these are quality fixes, not new tests)
  Audit: 261 (87 schemas × 3 modes) — 260 passing tests + 1
         discovery (was 260 passing, 0 visible regression in
         net count, but better partial-coverage shape)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant