feat(iv): OperationRepo IV runtime — pre-HYDRATE deferral, FAIL_UNAUTHORIZED handler, IdentityVerificationService (3/6)#2625
Conversation
0caf0f3 to
aec3996
Compare
…HORIZED handler, IdentityVerificationService PR 3 of 6 against the Identity Verification integration branch — wires queue- level IV behavior on top of PR 2's `Operation.externalId` refactor. Interface additions on `IOperationRepo`: - `setJwtInvalidatedHandler((externalId: String) -> Unit)?)` — single-handler setter; PR 5's UserManager will wire in a lambda that fires the public listener EventProducer. - `removeOperationsWithoutExternalId()` — drops anon ops from both in-memory queue and the persistent store, so they don't resurrect on cold start. `OperationRepo`: - Injects `JwtTokenStore` (no cross-package-interface abstraction since core already imports user-layer classes elsewhere — 34 existing imports make the layering boundary organizational, not architectural). - Pre-HYDRATE deferral at the top of `getNextOps`: when `useIdentityVerification == JwtRequirement.UNKNOWN`, return null so the loop waits. Applies unconditionally because we can't tell Phase 1 / 2 / 3 apart pre-HYDRATE; brief first-launch delay for non-IV users is tolerable (typically <2s, cached for subsequent launches). - Outer `newCodePathsRun` dispatch at `getNextOps` filter and the FAIL_UNAUTHORIZED executor-result branch. IV-specific logic lives in a new `OperationRepoIvExtensions.kt`. `OperationRepoIvExtensions.kt` (side-by-side with OperationRepo per the PR plan): - `hasValidJwtIfRequired(jwtTokenStore, op)`: inner `ivBehaviorActive` check short-circuits to `true` when IV is inactive (Phase 3 runs the new paths without actual JWT gating). Otherwise checks `requiresJwt`, then presence of a JWT for the op's `externalId`. - `handleFailUnauthorized(startingOp, ops, jwtTokenStore, handler)`: when IV is active and the op has an externalId, invalidates the JWT, fires the handler, wakes waiters with `false`, and re-queues ops without their waiters (so they can retry once a fresh JWT arrives). Returns `false` when IV inactive or op is anon, so the base class falls through to the default drop-on-fail path. Handler fires BEFORE waiters wake so `enqueueAndWait` callers can't return before the handler propagates. New class `IdentityVerificationService`: - Subscribes to `ConfigModelStore`; reacts only to HYDRATE. - Schedules the purge via `suspendifyOnIO` + `awaitInitialized()` — fix for an earlier race where anon-op purge ran against an empty in-memory queue on cold start because it fired before `loadSavedOperations` completed. - When the new state is `JwtRequirement.REQUIRED`, purges anon ops. Always calls `forceExecuteOperations` to wake the queue from the pre-HYDRATE deferral. Tests (+11): - 5 `IdentityVerificationServiceTests`: subscribe behavior, HYDRATE with each requirement (REQUIRED purges + wakes, NOT_REQUIRED wakes only, UNKNOWN wakes only), non-HYDRATE tag ignored. `coVerifyOrder` asserts purge runs after `awaitInitialized()` — the race-fix invariant. - 6 new OperationRepoTests: pre-HYDRATE deferral (op not executed, queue preserved); flip-to-NOT_REQUIRED unblocks; `removeOperationsWithoutExternalId` filters queue and persists removal; `setJwtInvalidatedHandler` round-trip; FAIL_UNAUTHORIZED with IV active invalidates JWT + fires handler + re-queues; FAIL_UNAUTHORIZED with IV inactive falls through to default drop. Existing 791 tests still pass (2 pre-existing `SDKInitTests` failures on the integration branch remain; unrelated).
9f21c64 to
89c5d54
Compare
… UNKNOWN Bot caught a regression: deferral keyed on `useIdentityVerification == UNKNOWN` permanently stalls the queue whenever the backend response omits `require_ident_auth` (older deployments, dev/test environments, partial rollouts — the `TODO: New` comment in ParamsBackendService explicitly flags this field as newly-added). HYDRATE flips `isInitializedWithRemote` to true but leaves `useIdentityVerification` at UNKNOWN; `getNextOps` re-defers on every wake and ops never dispatch. Switch the gate to `!isInitializedWithRemote` — set unconditionally by `ConfigModelStoreListener` on any successful HYDRATE, so it correctly signals "HYDRATE has completed" regardless of which keys the backend returned. Post- HYDRATE with a silent backend, `useIdentityVerification` stays UNKNOWN and `ivBehaviorActive` is false, so ops flow through the old code path normally (no JWT gating applied, matching Phase 1 behavior). Tests: - Updated both existing pre-HYDRATE tests to flip `isInitializedWithRemote`. - New regression-guard test: post-HYDRATE with backend silent on the field (isInitializedWithRemote=true, useIdentityVerification=UNKNOWN) dispatches ops instead of deadlocking. - `MockHelper.configModelStore` defaults (in the core-module test helpers) now also set `isInitializedWithRemote=true` so tests that don't exercise pre-HYDRATE aren't accidentally held up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@claude review |
There was a problem hiding this comment.
My prior concern about the pre-HYDRATE deferral deadlocking when the backend omits require_ident_auth is addressed — the gate is now on isInitializedWithRemote and there's a regression test for the backend-silent case. No new bugs found on this pass, but this PR is complex and touches identity-verification/auth-sensitive queue runtime across several files, so a human review is still warranted.
Extended reasoning...
Overview
PR 3 of 6 in the Identity Verification rollout, wiring queue-level IV behavior onto OperationRepo. Changes span: (1) a pre-HYDRATE deferral at the top of getNextOps so the queue waits until remote params are fetched before dispatching ops, (2) a new FAIL_UNAUTHORIZED handler path that invalidates the JWT, fires a developer-facing handler, and re-queues ops (when IV is active), (3) removeOperationsWithoutExternalId to purge anon ops from both the in-memory queue and the persistent OperationModelStore, (4) a new IdentityVerificationService IStartableService that subscribes to ConfigModelStore HYDRATE events and schedules purge + wake via suspendifyOnIO + awaitInitialized to fix a cold-start race, and (5) outer/inner two-layer gating on IdentityVerificationGates.newCodePathsRun and ivBehaviorActive. 11 new unit tests plus a regression test for the backend-silent deadlock case I previously flagged.
Security risks
This PR is squarely in auth-sensitive territory: it changes when and how operations are dispatched based on JWT/IV state, adds a handler for 401 Unauthorized responses, and introduces anonymous-op purging. The handler-fires-before-waiter-wake ordering in handleFailUnauthorized is load-bearing for correctness of enqueueAndWait callers. The two-layer gating (newCodePathsRun outer + ivBehaviorActive inner) is a risk-management mechanism for staged rollout — if either gate is miscomputed, Phase 1/3 users could see IV behavior they shouldn't. The pre-HYDRATE deferral is now correctly gated on isInitializedWithRemote rather than the IV param itself, so no deadlock on silent-backend responses.
Level of scrutiny
High. This touches the core operation dispatch path that every network-bound SDK action flows through (push subscription creates, tag updates, logins, outcomes, etc.). Regressions here could silently stall the entire queue for an app. The IV-specific logic is also directly on the auth/identity path. My prior bug report was the kind of issue that would have slipped past the existing test suite (an UNKNOWN test existed but only asserted forceExecuteOperations was called, not that any op actually dispatched). The new post-HYDRATE with backend silent on require_ident_auth test plus the semantic shift to isInitializedWithRemote closes that hole properly.
Other factors
The PR description is thorough and accurate — it correctly calls out the rationale for taking JwtTokenStore directly rather than abstracting behind an IJwtStore interface, and it explains the layering/Phase matrix clearly. Test coverage is strong for the new paths (6 OperationRepoTests, 5 IdentityVerificationServiceTests). The two pre-existing SDKInitTests failures noted as unrelated have been consistently failing on the integration branch across prior PRs. A code owner (nan-li) is the one who pinged me to re-review, signaling they're already engaged. Given the scope, complexity, cross-PR coupling, and the security-sensitive nature of the surface area, this is exactly the kind of change that benefits from a human code owner's review rather than bot approval.
Description
One Line Summary
PR 3 of 6 against the Identity Verification integration branch — wires queue-level IV behavior (pre-HYDRATE deferral, FAIL_UNAUTHORIZED handler, anonymous-op purge) on top of PR 2's
Operation.externalIdbase-class refactor.Details
Motivation
PR 1 landed the gate state (
IdentityVerificationGates) and JWT store; PR 2 gave every operation anexternalId. This PR is where IV starts to actually do something at runtime. It lands the three behavioral pieces ofOperationRepo+IdentityVerificationServicethat are necessary before the public API can be added:jwt_requiredis true or false, otherwise a Phase 2 user's first-launch ops would race past the JWT gate and 401.FAIL_UNAUTHORIZEDhandling — without this, a 401 drops the op and the developer has no way to recover; the SDK needs to notify the app so it can supply a fresh JWT, then retry.Scope
IOperationRepogains two methods (both used only by internal SDK components — external callers don't see these change on the public surface):fun setJwtInvalidatedHandler(handler: ((externalId: String) -> Unit)?)— single-handler setter, set once by PR 5'sUserManagerwith a lambda that fires the developer-facing listenerEventProducer. Intentionally not anEventProducerhere: there's exactly one internal subscriber.fun removeOperationsWithoutExternalId()— drops anon ops from both the in-memory queue and the persistentOperationModelStoreso they don't resurrect on cold start.OperationRepo:JwtTokenStoredirectly. We considered a narrowIJwtStoreinterface defined in core (implemented by user-layerJwtTokenStore) to preserve core→user layering, but foundOperationRepoalready imports user-layer classes directly (e.g.LoginUserOperation,NewRecordsState), and 34 core files import from user overall. The core/user split is organizational, not architectural, so adding an abstraction for this one case would be inconsistent with the codebase.getNextOps:useIdentityVerificationis cached inConfigModelafter the first successful params fetch so subsequent launches skip this.newCodePathsRundispatch at thegetNextOpsfilter and theFAIL_UNAUTHORIZEDexecutor-result branch. Ops on the old code path (feature flag off,jwt_required=false) are unchanged. SplitFAIL_UNAUTHORIZEDout of the combinedFAIL_NORETRY/FAIL_CONFLICTbranch because its handling is now conditional.New
OperationRepoIvExtensions.kt(side-by-side withOperationRepo.ktper the PR plan's Option A file layout):hasValidJwtIfRequired(jwtTokenStore, op)— innerivBehaviorActiveguard short-circuits totruewhen IV behavior is off, so Phase 3 users (new code paths on, IV off) run through the dispatch site but no actual JWT gating happens. Returnsfalsefor anonymous ops when IV is active; otherwise checksjwtTokenStore.getJwt(op.externalId) != null.handleFailUnauthorized(startingOp, ops, jwtTokenStore, handler)— same inner gate. Returnsfalsewhen IV inactive or op is anon so the caller falls through to the default drop-on-fail path. When handled: invalidates the JWT, fires the handler before waking waiters (otherwise anenqueueAndWaitcaller could return before the handler propagates to the app), wakes waiters withfalse, and re-queues ops without their waiters so they can retry once a fresh JWT arrives.Both extensions take their dependencies as parameters rather than accessing
OperationRepoprivate fields — keeps encapsulation intact and the extensions pure-ish.New
IdentityVerificationService(DI-registeredIStartableService):ConfigModelStore; reacts only to HYDRATE.suspendifyOnIO { operationRepo.awaitInitialized(); ... }— fix for an earlier race where anon-op purge ran against an empty in-memory queue on cold start because it fired beforeloadSavedOperationscompleted.useIdentityVerificationisREQUIRED. Always callsforceExecuteOperationsto wake the queue from the pre-HYDRATE deferral regardless of the new state.CoreModuleregistersIdentityVerificationServiceas anIStartableService.Two-layer gating
The memory-plan design is preserved:
OperationRepo.kt):if (IdentityVerificationGates.newCodePathsRun) extensionFunction(...). When off, old code path runs untouched.if (!IdentityVerificationGates.ivBehaviorActive) return <no-op>. Lets Phase 3 users (new code paths on, IV off) run through the new dispatch without any actual IV gating applied — this is how we validate the new code paths structurally before flipping IV active for each customer.What is NOT in this PR
OneSignal.login(externalId, jwt),updateUserJwt, JWT-invalidated listeners) — PR 4.Authorization: Bearerheader attachment — PR 5.JwtTokenStore.pruneToExternalIds()invocation on cold start — deferred; not needed yet.Testing
Unit testing
11 new tests, 789 existing tests still pass (2 pre-existing
SDKInitTestsfailures on the integration branch are unrelated — same 2 that fail onfeat/iv-foundation-01/feat/iv-oprepo-gating-02).New
IdentityVerificationServiceTests(5):start subscribes to ConfigModelStore.HYDRATE with REQUIRED purges anon ops then wakes the queue (after awaitInitialized)— usescoVerifyOrderto assert the purge runs afterawaitInitialized(), which is the whole point of the race fix.HYDRATE with NOT_REQUIRED wakes the queue but does NOT purge.HYDRATE with UNKNOWN wakes the queue (no purge).non-HYDRATE model replacement is ignored.New
OperationRepoTests(6):useIdentityVerificationtoNOT_REQUIREDand callingforceExecuteOperationsunblocks the queue.removeOperationsWithoutExternalIdfilters the queue AND removes the persisted ops from the store.setJwtInvalidatedHandlerround-trip via internal accessor.FAIL_UNAUTHORIZEDwith IV active: invalidates JWT, fires handler, re-queues ops (not dropped), wakesenqueueAndWaitcaller withfalse.FAIL_UNAUTHORIZEDwith IV inactive: handler NOT fired, default drop-on-fail path runs (op removed from store).Each IV test scopes its
IdentityVerificationGatesmutations intry/finallyto reset the singleton;Mocks()setsconfigModel.useIdentityVerification = JwtRequirement.NOT_REQUIREDby default so the pre-HYDRATE deferral doesn't hold up tests that aren't exercising IV state.Manual testing
Not applicable for this PR — no runtime behavior change for Phase 1 users (feature flag off,
jwt_requiredeventuallyfalse) once HYDRATE completes; IV-specific behavior is gated off pending the rest of the rollout series. The pre-HYDRATE deferral is covered by unit tests.Affected code checklist
Checklist
Overview
Testing
SDKInitTestsfailures are unrelated — same 2 on integration branch)Final pass