
feat(iv): OperationRepo IV runtime — pre-HYDRATE deferral, FAIL_UNAUTHORIZED handler, IdentityVerificationService (3/6)#2625

Open
nan-li wants to merge 2 commits into feat/iv-oprepo-gating-02 from feat/iv-queue-runtime-03


nan-li commented Apr 24, 2026

Description

One Line Summary

PR 3 of 6 against the Identity Verification integration branch — wires queue-level IV behavior (pre-HYDRATE deferral, FAIL_UNAUTHORIZED handler, anonymous-op purge) on top of PR 2's Operation.externalId base-class refactor.

Details

Motivation

PR 1 landed the gate state (IdentityVerificationGates) and JWT store; PR 2 gave every operation an externalId. This PR is where IV starts to actually do something at runtime. It lands the three behavioral pieces of OperationRepo + IdentityVerificationService that are necessary before the public API can be added:

  1. Pre-HYDRATE deferral — the SDK can't dispatch ops until it knows whether jwt_required is true or false, otherwise a Phase 2 user's first-launch ops would race past the JWT gate and 401.
  2. FAIL_UNAUTHORIZED handling — without this, a 401 drops the op and the developer has no way to recover; the SDK needs to notify the app so it can supply a fresh JWT, then retry.
  3. Anonymous-op purge — when IV becomes active, ops queued under the previous anonymous user can't authenticate and would otherwise block the queue indefinitely.

Scope

IOperationRepo gains two methods (both used only by internal SDK components — external callers don't see these change on the public surface):

  • fun setJwtInvalidatedHandler(handler: ((externalId: String) -> Unit)?) — single-handler setter, set once by PR 5's UserManager with a lambda that fires the developer-facing listener EventProducer. Intentionally not an EventProducer here: there's exactly one internal subscriber.
  • fun removeOperationsWithoutExternalId() — drops anon ops from both the in-memory queue and the persistent OperationModelStore so they don't resurrect on cold start.
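
A minimal sketch of the two additions, for illustration only. The two method signatures are the ones listed above; `SketchOperation`, `SketchIOperationRepo`, `SketchOperationRepo`, and the plain `MutableList` standing in for OperationModelStore are hypothetical and not the SDK's actual types.

```kotlin
// Hypothetical stand-in for Operation; only externalId matters here.
data class SketchOperation(val id: String, val externalId: String?)

// The two signatures below mirror the ones this PR adds to IOperationRepo.
interface SketchIOperationRepo {
    fun setJwtInvalidatedHandler(handler: ((externalId: String) -> Unit)?)
    fun removeOperationsWithoutExternalId()
}

class SketchOperationRepo(
    // Assumption: a mutable list stands in for the persistent OperationModelStore.
    val persisted: MutableList<SketchOperation>,
) : SketchIOperationRepo {
    // Assumption: the in-memory queue starts as a copy of the persisted ops.
    val queue: MutableList<SketchOperation> = persisted.toMutableList()

    // Single handler slot; a later call replaces the earlier handler.
    var jwtInvalidatedHandler: ((String) -> Unit)? = null
        private set

    override fun setJwtInvalidatedHandler(handler: ((externalId: String) -> Unit)?) {
        jwtInvalidatedHandler = handler
    }

    override fun removeOperationsWithoutExternalId() {
        // Remove from both places so anonymous ops cannot reappear on cold start.
        queue.removeAll { it.externalId == null }
        persisted.removeAll { it.externalId == null }
    }
}
```

Removing from the queue alone would leave the persisted copies behind, and they would be reloaded on the next cold start, which is the resurrection case the second bullet describes.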

OperationRepo:

  • Injects JwtTokenStore directly. We considered a narrow IJwtStore interface defined in core (implemented by user-layer JwtTokenStore) to preserve core→user layering, but found OperationRepo already imports user-layer classes directly (e.g. LoginUserOperation, NewRecordsState), and 34 core files import from user overall. The core/user split is organizational, not architectural, so adding an abstraction for this one case would be inconsistent with the codebase.
  • Pre-HYDRATE deferral at the top of getNextOps:
    if (_configModelStore.model.useIdentityVerification == JwtRequirement.UNKNOWN) return null
    Unconditional — applies even to Phase 1 non-IV users — because pre-HYDRATE we can't distinguish phases. First-launch delay is typically <2 s; useIdentityVerification is cached in ConfigModel after the first successful params fetch so subsequent launches skip this.
  • Outer newCodePathsRun dispatch at the getNextOps filter and the FAIL_UNAUTHORIZED executor-result branch. Ops on the old code path (feature flag off, jwt_required=false) are unchanged. Split FAIL_UNAUTHORIZED out of the combined FAIL_NORETRY/FAIL_CONFLICT branch because its handling is now conditional.
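
The FAIL_UNAUTHORIZED split can be sketched roughly as below. This is an illustration under assumptions: `ResultSketch` is a reduced, hypothetical version of the executor result enum, and the returned strings only label which path would run.

```kotlin
// Hypothetical, reduced result enum; the SDK's enum may contain other values.
enum class ResultSketch { SUCCESS, FAIL_NORETRY, FAIL_CONFLICT, FAIL_UNAUTHORIZED }

// Returns a label for the action taken, purely for illustration.
fun onExecutorResult(
    result: ResultSketch,
    newCodePathsRun: Boolean,
    // Stand-in for the IV extension; returns true when it handled the failure.
    handleUnauthorized: () -> Boolean,
): String = when (result) {
    ResultSketch.SUCCESS -> "remove"
    // Own branch: handling is conditional on the outer gate and the extension's result.
    ResultSketch.FAIL_UNAUTHORIZED ->
        if (newCodePathsRun && handleUnauthorized()) "requeue" else "drop"
    // Unchanged default drop-on-fail path.
    ResultSketch.FAIL_NORETRY, ResultSketch.FAIL_CONFLICT -> "drop"
}
```

With the outer gate off, the extension is never invoked and FAIL_UNAUTHORIZED behaves exactly like the other two failure results.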

New OperationRepoIvExtensions.kt (side-by-side with OperationRepo.kt per the PR plan's Option A file layout):

  • hasValidJwtIfRequired(jwtTokenStore, op) — inner ivBehaviorActive guard short-circuits to true when IV behavior is off, so Phase 3 users (new code paths on, IV off) run through the dispatch site but no actual JWT gating happens. Returns false for anonymous ops when IV is active; otherwise checks jwtTokenStore.getJwt(op.externalId) != null.
  • handleFailUnauthorized(startingOp, ops, jwtTokenStore, handler) — same inner gate. Returns false when IV inactive or op is anon so the caller falls through to the default drop-on-fail path. When handled: invalidates the JWT, fires the handler before waking waiters (otherwise an enqueueAndWait caller could return before the handler propagates to the app), wakes waiters with false, and re-queues ops without their waiters so they can retry once a fresh JWT arrives.

Both extensions take their dependencies as parameters rather than accessing OperationRepo private fields — keeps encapsulation intact and the extensions pure-ish.
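
A rough sketch of the two extensions as plain functions, to make the ordering concrete. Everything except the function names and `getJwt(externalId)` is an assumption: `SketchGates`, `SketchJwtStore` (including `putJwt` and `invalidateJwt`), `SketchOp`, `SketchQueueItem`, and the `requeue` callback are hypothetical stand-ins.

```kotlin
// Hypothetical stand-in for the inner gate on IdentityVerificationGates.
object SketchGates {
    var ivBehaviorActive: Boolean = false
}

// Hypothetical token store; only getJwt(externalId) is named in this PR.
class SketchJwtStore {
    private val tokens = mutableMapOf<String, String>()
    fun putJwt(externalId: String, jwt: String) { tokens[externalId] = jwt }
    fun getJwt(externalId: String): String? = tokens[externalId]
    fun invalidateJwt(externalId: String) { tokens.remove(externalId) }
}

class SketchOp(val externalId: String?)

// Pairs an op with an optional waiter (stand-in for an enqueueAndWait caller).
class SketchQueueItem(val op: SketchOp, val waiter: ((Boolean) -> Unit)?)

fun hasValidJwtIfRequired(store: SketchJwtStore, op: SketchOp): Boolean {
    if (!SketchGates.ivBehaviorActive) return true  // inner gate: no JWT gating
    val externalId = op.externalId ?: return false  // anonymous ops cannot authenticate
    return store.getJwt(externalId) != null
}

// Returns false when unhandled, so the caller falls back to drop-on-fail.
fun handleFailUnauthorized(
    startingOp: SketchOp,
    items: List<SketchQueueItem>,
    store: SketchJwtStore,
    handler: ((String) -> Unit)?,
    requeue: (SketchQueueItem) -> Unit,
): Boolean {
    if (!SketchGates.ivBehaviorActive) return false
    val externalId = startingOp.externalId ?: return false
    store.invalidateJwt(externalId)
    handler?.invoke(externalId)                 // 1. notify the app first
    items.forEach { it.waiter?.invoke(false) }  // 2. then wake waiters with false
    // 3. re-queue without waiters so the ops retry once a fresh JWT arrives
    items.forEach { requeue(SketchQueueItem(it.op, waiter = null)) }
    return true
}
```

Swapping steps 1 and 2 would let a waiter observe `false` before the handler has run, which is the ordering problem the description calls out.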

New IdentityVerificationService (DI-registered IStartableService):

  • Subscribes to ConfigModelStore; reacts only to HYDRATE.
  • Schedules the purge via suspendifyOnIO { operationRepo.awaitInitialized(); ... }. This fixes an earlier race where the anon-op purge ran against an empty in-memory queue on cold start because it fired before loadSavedOperations completed.
  • Purges anon ops only when the new useIdentityVerification is REQUIRED. Always calls forceExecuteOperations to wake the queue from the pre-HYDRATE deferral regardless of the new state.
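
The sequencing above can be sketched without coroutines. This is an illustration under assumptions: a plain thread stands in for suspendifyOnIO, a `CountDownLatch` stands in for `awaitInitialized()`, the change tag is modeled as the string `"HYDRATE"`, and `SvcRepo`, `SvcJwtRequirement`, and `SvcIdentityVerificationService` are hypothetical names.

```kotlin
import java.util.Collections
import java.util.concurrent.CountDownLatch
import kotlin.concurrent.thread

enum class SvcJwtRequirement { UNKNOWN, NOT_REQUIRED, REQUIRED }

// Hypothetical repo stand-in that records the order of calls.
class SvcRepo {
    private val initialized = CountDownLatch(1)
    val calls: MutableList<String> = Collections.synchronizedList(mutableListOf<String>())

    // Models loadSavedOperations finishing.
    fun markInitialized() { calls.add("loaded"); initialized.countDown() }
    fun awaitInitialized() { initialized.await() }
    fun removeOperationsWithoutExternalId() { calls.add("purge") }
    fun forceExecuteOperations() { calls.add("wake") }
}

class SvcIdentityVerificationService(private val repo: SvcRepo) {
    // Returns the worker thread for HYDRATE, or null when the change is ignored.
    fun onModelReplaced(tag: String, requirement: SvcJwtRequirement): Thread? {
        if (tag != "HYDRATE") return null
        return thread {
            repo.awaitInitialized() // never purge before saved ops are loaded
            if (requirement == SvcJwtRequirement.REQUIRED) {
                repo.removeOperationsWithoutExternalId()
            }
            repo.forceExecuteOperations() // always wake the deferred queue
        }
    }
}
```

Because the worker blocks on `awaitInitialized()`, a HYDRATE that arrives before the saved operations are loaded still purges against the full queue rather than an empty one.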

CoreModule registers IdentityVerificationService as an IStartableService.

Two-layer gating

The memory-plan design is preserved:

  • Outer gate (dispatch site, OperationRepo.kt): if (IdentityVerificationGates.newCodePathsRun) extensionFunction(...). When off, old code path runs untouched.
  • Inner gate (extension body): if (!IdentityVerificationGates.ivBehaviorActive) return <no-op>. Lets Phase 3 users (new code paths on, IV off) run through the new dispatch without any actual IV gating applied — this is how we validate the new code paths structurally before flipping IV active for each customer.
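
The two gates combine as in the sketch below. It is illustrative only: `GateSketch` is a hypothetical stand-in for IdentityVerificationGates, and the returned strings simply label which path ran.

```kotlin
// Hypothetical stand-in for IdentityVerificationGates.
object GateSketch {
    var newCodePathsRun: Boolean = false
    var ivBehaviorActive: Boolean = false
}

// Inner gate: lives in the extension body.
fun ivCheckSketch(hasJwt: Boolean): Boolean {
    if (!GateSketch.ivBehaviorActive) return true // new path runs, no IV gating applied
    return hasJwt
}

// Outer gate: lives at the dispatch site.
fun canDispatchSketch(hasJwt: Boolean): String =
    if (GateSketch.newCodePathsRun) {
        if (ivCheckSketch(hasJwt)) "new path: dispatch" else "new path: hold"
    } else {
        "old path: dispatch"
    }
```

With both gates off the old path runs untouched; with only the outer gate on, the new dispatch runs but never holds an op; only with both on does a missing JWT hold the op.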

What is NOT in this PR

  • Public API (OneSignal.login(externalId, jwt), updateUserJwt, JWT-invalidated listeners) — PR 4.
  • HTTP layer Authorization: Bearer header attachment — PR 5.
  • Logout IV-aware behavior + IAM integration — PR 6.
  • JwtTokenStore.pruneToExternalIds() invocation on cold start — deferred; not needed yet.

Testing

Unit testing

11 new tests, 789 existing tests still pass (2 pre-existing SDKInitTests failures on the integration branch are unrelated — same 2 that fail on feat/iv-foundation-01 / feat/iv-oprepo-gating-02).

New IdentityVerificationServiceTests (5):

  • start subscribes to ConfigModelStore.
  • HYDRATE with REQUIRED purges anon ops then wakes the queue (after awaitInitialized) — uses coVerifyOrder to assert the purge runs after awaitInitialized(), which is the whole point of the race fix.
  • HYDRATE with NOT_REQUIRED wakes the queue but does NOT purge.
  • HYDRATE with UNKNOWN wakes the queue (no purge).
  • non-HYDRATE model replacement is ignored.

New OperationRepoTests (6):

  • Pre-HYDRATE deferral blocks execution (queue retains op; executor not called).
  • Flipping useIdentityVerification to NOT_REQUIRED and calling forceExecuteOperations unblocks the queue.
  • removeOperationsWithoutExternalId filters the queue AND removes the persisted ops from the store.
  • setJwtInvalidatedHandler round-trip via internal accessor.
  • FAIL_UNAUTHORIZED with IV active: invalidates JWT, fires handler, re-queues ops (not dropped), wakes enqueueAndWait caller with false.
  • FAIL_UNAUTHORIZED with IV inactive: handler NOT fired, default drop-on-fail path runs (op removed from store).

Each IV test scopes its IdentityVerificationGates mutations in try/finally to reset the singleton; Mocks() sets configModel.useIdentityVerification = JwtRequirement.NOT_REQUIRED by default so the pre-HYDRATE deferral doesn't hold up tests that aren't exercising IV state.
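
The try/finally scoping could look roughly like this helper. It is a sketch under assumptions: `ScopedGates` and `withIvGates` are hypothetical names, and the tests in this PR may inline the try/finally instead of using a helper.

```kotlin
// Hypothetical stand-in for the IdentityVerificationGates singleton.
object ScopedGates {
    var newCodePathsRun: Boolean = false
    var ivBehaviorActive: Boolean = false
}

// Runs block with the gates set, always restoring the previous values afterwards.
fun <T> withIvGates(newCodePathsRun: Boolean, ivBehaviorActive: Boolean, block: () -> T): T {
    val previous = ScopedGates.newCodePathsRun to ScopedGates.ivBehaviorActive
    ScopedGates.newCodePathsRun = newCodePathsRun
    ScopedGates.ivBehaviorActive = ivBehaviorActive
    try {
        return block()
    } finally {
        // Restore even when the block throws, so later tests see a clean singleton.
        ScopedGates.newCodePathsRun = previous.first
        ScopedGates.ivBehaviorActive = previous.second
    }
}
```

Restoring in `finally` matters because a failed assertion throws, and without the reset a single failing IV test could leak gate state into unrelated tests.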

Manual testing

Not applicable for this PR — no runtime behavior change for Phase 1 users (feature flag off, jwt_required eventually false) once HYDRATE completes; IV-specific behavior is gated off pending the rest of the rollout series. The pre-HYDRATE deferral is covered by unit tests.

Affected code checklist

  • Notifications
  • Outcomes
  • Sessions
  • In-App Messaging
  • REST API requests (ops can now be held/re-queued/dropped based on IV state; no change to request shapes)
  • Public API changes

Checklist

Overview

  • I have filled out all REQUIRED sections above
  • PR does one thing — wires queue-level IV behavior onto the refactor from PR 2
  • Any Public API changes are explained in the PR details (none in this PR)

Testing

  • I have included test coverage for these changes
  • All automated tests pass locally (pre-existing SDKInitTests failures are unrelated — same 2 on integration branch)
  • Manual testing N/A (IV-specific behavior is gated off pending PR 4+ wiring)

Final pass

  • Code is as readable as possible.
  • I have reviewed this PR myself, ensuring it meets each checklist item

nan-li force-pushed the feat/iv-oprepo-gating-02 branch from 0caf0f3 to aec3996 on April 24, 2026 19:55
…HORIZED handler, IdentityVerificationService

PR 3 of 6 against the Identity Verification integration branch — wires queue-
level IV behavior on top of PR 2's `Operation.externalId` refactor.

Interface additions on `IOperationRepo`:
- `setJwtInvalidatedHandler(handler: ((externalId: String) -> Unit)?)`: single-handler
  setter; PR 5's UserManager will wire in a lambda that fires the public
  listener EventProducer.
- `removeOperationsWithoutExternalId()` — drops anon ops from both in-memory
  queue and the persistent store, so they don't resurrect on cold start.

`OperationRepo`:
- Injects `JwtTokenStore` (no cross-package-interface abstraction since core
  already imports user-layer classes elsewhere — 34 existing imports make
  the layering boundary organizational, not architectural).
- Pre-HYDRATE deferral at the top of `getNextOps`: when
  `useIdentityVerification == JwtRequirement.UNKNOWN`, return null so the
  loop waits. Applies unconditionally because we can't tell Phase 1 / 2 / 3
  apart pre-HYDRATE; brief first-launch delay for non-IV users is tolerable
  (typically <2s, cached for subsequent launches).
- Outer `newCodePathsRun` dispatch at `getNextOps` filter and the
  FAIL_UNAUTHORIZED executor-result branch. IV-specific logic lives in a new
  `OperationRepoIvExtensions.kt`.

`OperationRepoIvExtensions.kt` (side-by-side with OperationRepo per the PR plan):
- `hasValidJwtIfRequired(jwtTokenStore, op)`: inner `ivBehaviorActive` check
  short-circuits to `true` when IV is inactive (Phase 3 runs the new paths
  without actual JWT gating). Otherwise checks `requiresJwt`, then presence
  of a JWT for the op's `externalId`.
- `handleFailUnauthorized(startingOp, ops, jwtTokenStore, handler)`: when IV
  is active and the op has an externalId, invalidates the JWT, fires the
  handler, wakes waiters with `false`, and re-queues ops without their
  waiters (so they can retry once a fresh JWT arrives). Returns `false`
  when IV inactive or op is anon, so the base class falls through to the
  default drop-on-fail path. Handler fires BEFORE waiters wake so
  `enqueueAndWait` callers can't return before the handler propagates.

New class `IdentityVerificationService`:
- Subscribes to `ConfigModelStore`; reacts only to HYDRATE.
- Schedules the purge via `suspendifyOnIO` + `awaitInitialized()` — fix for
  an earlier race where anon-op purge ran against an empty in-memory queue
  on cold start because it fired before `loadSavedOperations` completed.
- When the new state is `JwtRequirement.REQUIRED`, purges anon ops. Always
  calls `forceExecuteOperations` to wake the queue from the pre-HYDRATE
  deferral.

Tests (+11):
- 5 `IdentityVerificationServiceTests`: subscribe behavior, HYDRATE with each
  requirement (REQUIRED purges + wakes, NOT_REQUIRED wakes only, UNKNOWN
  wakes only), non-HYDRATE tag ignored. `coVerifyOrder` asserts purge runs
  after `awaitInitialized()` — the race-fix invariant.
- 6 new OperationRepoTests: pre-HYDRATE deferral (op not executed, queue
  preserved); flip-to-NOT_REQUIRED unblocks; `removeOperationsWithoutExternalId`
  filters queue and persists removal; `setJwtInvalidatedHandler` round-trip;
  FAIL_UNAUTHORIZED with IV active invalidates JWT + fires handler + re-queues;
  FAIL_UNAUTHORIZED with IV inactive falls through to default drop.

Existing 791 tests still pass (2 pre-existing `SDKInitTests` failures on the
integration branch remain; unrelated).
nan-li force-pushed the feat/iv-queue-runtime-03 branch from 9f21c64 to 89c5d54 on April 24, 2026 20:00
… UNKNOWN

Bot caught a regression: deferral keyed on `useIdentityVerification == UNKNOWN`
permanently stalls the queue whenever the backend response omits
`require_ident_auth` (older deployments, dev/test environments, partial
rollouts — the `TODO: New` comment in ParamsBackendService explicitly flags
this field as newly-added). HYDRATE flips `isInitializedWithRemote` to true
but leaves `useIdentityVerification` at UNKNOWN; `getNextOps` re-defers on
every wake and ops never dispatch.

Switch the gate to `!isInitializedWithRemote` — set unconditionally by
`ConfigModelStoreListener` on any successful HYDRATE, so it correctly signals
"HYDRATE has completed" regardless of which keys the backend returned. Post-
HYDRATE with a silent backend, `useIdentityVerification` stays UNKNOWN and
`ivBehaviorActive` is false, so ops flow through the old code path normally
(no JWT gating applied, matching Phase 1 behavior).
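
The difference between the two gates can be sketched as follows. This is an illustration under assumptions: `DeferConfig` and `DeferJwtRequirement` are hypothetical stand-ins for the two ConfigModel fields, and the functions return the whole queue rather than selecting ops the way `getNextOps` does.

```kotlin
enum class DeferJwtRequirement { UNKNOWN, NOT_REQUIRED, REQUIRED }

// Hypothetical stand-in for the two ConfigModel fields discussed here.
class DeferConfig(
    var isInitializedWithRemote: Boolean = false,
    var useIdentityVerification: DeferJwtRequirement = DeferJwtRequirement.UNKNOWN,
)

// Original gate: keeps deferring if the backend never sends the field.
fun nextOpsKeyedOnUnknown(config: DeferConfig, queue: List<String>): List<String>? =
    if (config.useIdentityVerification == DeferJwtRequirement.UNKNOWN) null else queue

// Revised gate: defers only until any successful HYDRATE.
fun nextOpsKeyedOnHydrate(config: DeferConfig, queue: List<String>): List<String>? =
    if (!config.isInitializedWithRemote) null else queue
```

For a silent backend (`isInitializedWithRemote = true`, `useIdentityVerification = UNKNOWN`), the first function returns null on every wake while the second returns the queued ops.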

Tests:
- Updated both existing pre-HYDRATE tests to flip `isInitializedWithRemote`.
- New regression-guard test: post-HYDRATE with backend silent on the field
  (isInitializedWithRemote=true, useIdentityVerification=UNKNOWN) dispatches
  ops instead of deadlocking.
- `MockHelper.configModelStore` defaults (in the core-module test helpers)
  now also set `isInitializedWithRemote=true` so tests that don't exercise
  pre-HYDRATE aren't accidentally held up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

nan-li commented Apr 24, 2026

@claude review


claude (bot) left a comment


My prior concern about the pre-HYDRATE deferral deadlocking when the backend omits require_ident_auth is addressed — the gate is now on isInitializedWithRemote and there's a regression test for the backend-silent case. No new bugs found on this pass, but this PR is complex and touches identity-verification/auth-sensitive queue runtime across several files, so a human review is still warranted.

Extended reasoning...

Overview

PR 3 of 6 in the Identity Verification rollout, wiring queue-level IV behavior onto OperationRepo. Changes span: (1) a pre-HYDRATE deferral at the top of getNextOps so the queue waits until remote params are fetched before dispatching ops, (2) a new FAIL_UNAUTHORIZED handler path that invalidates the JWT, fires a developer-facing handler, and re-queues ops (when IV is active), (3) removeOperationsWithoutExternalId to purge anon ops from both the in-memory queue and the persistent OperationModelStore, (4) a new IdentityVerificationService IStartableService that subscribes to ConfigModelStore HYDRATE events and schedules purge + wake via suspendifyOnIO + awaitInitialized to fix a cold-start race, and (5) outer/inner two-layer gating on IdentityVerificationGates.newCodePathsRun and ivBehaviorActive. 11 new unit tests plus a regression test for the backend-silent deadlock case I previously flagged.

Security risks

This PR is squarely in auth-sensitive territory: it changes when and how operations are dispatched based on JWT/IV state, adds a handler for 401 Unauthorized responses, and introduces anonymous-op purging. The handler-fires-before-waiter-wake ordering in handleFailUnauthorized is load-bearing for correctness of enqueueAndWait callers. The two-layer gating (newCodePathsRun outer + ivBehaviorActive inner) is a risk-management mechanism for staged rollout — if either gate is miscomputed, Phase 1/3 users could see IV behavior they shouldn't. The pre-HYDRATE deferral is now correctly gated on isInitializedWithRemote rather than the IV param itself, so no deadlock on silent-backend responses.

Level of scrutiny

High. This touches the core operation dispatch path that every network-bound SDK action flows through (push subscription creates, tag updates, logins, outcomes, etc.). Regressions here could silently stall the entire queue for an app. The IV-specific logic is also directly on the auth/identity path. My prior bug report was the kind of issue that would have slipped past the existing test suite (an UNKNOWN test existed but only asserted forceExecuteOperations was called, not that any op actually dispatched). The new post-HYDRATE with backend silent on require_ident_auth test plus the semantic shift to isInitializedWithRemote closes that hole properly.

Other factors

The PR description is thorough and accurate — it correctly calls out the rationale for taking JwtTokenStore directly rather than abstracting behind an IJwtStore interface, and it explains the layering/Phase matrix clearly. Test coverage is strong for the new paths (6 OperationRepoTests, 5 IdentityVerificationServiceTests). The two pre-existing SDKInitTests failures noted as unrelated have been consistently failing on the integration branch across prior PRs. A code owner (nan-li) is the one who pinged me to re-review, signaling they're already engaged. Given the scope, complexity, cross-PR coupling, and the security-sensitive nature of the surface area, this is exactly the kind of change that benefits from a human code owner's review rather than bot approval.
