Skip to content

feat: hash PII (email/SMS) in SharedPreferences at rest#2620

Merged
sherwinski merged 1 commit intomainfrom
sherwin/sdk-4202-pii-hashing-main
Apr 21, 2026
Merged

feat: hash PII (email/SMS) in SharedPreferences at rest#2620
sherwinski merged 1 commit intomainfrom
sherwin/sdk-4202-pii-hashing-main

Conversation

@sherwinski
Copy link
Copy Markdown
Contributor

Description

One Line Summary

Cherry-pick of #2614 from 5.7-main to main.

Details

Motivation

Port the PII hashing feature to main so it ships in the next major release as well. See #2614 for full context, discussion, and manual testing details.

Scope

Identical to #2614 — hashes email and SMS subscription addresses (SHA-256) at the serialization boundary before writing to SharedPreferences. Push tokens are unchanged. No public API changes.

Testing

Unit testing

Includes all unit tests from #2614: PIIHasherTests, SubscriptionModelStoreTests, and hash-aware tests in SubscriptionManagerTests.

Manual testing

Covered in #2614 — same code, clean cherry-pick with no conflicts.

Affected code checklist

  • Notifications
  • Outcomes
  • Sessions
  • In-App Messaging
  • REST API requests
  • Public API changes

Checklist

Overview

  • I have filled out all REQUIRED sections above
  • PR does one thing
  • Any Public API changes are explained in the PR details and conform to existing APIs

Testing

  • I have included test coverage for these changes, or explained why they are not needed
  • All automated tests pass, or I explained why that is not possible
  • I have personally tested this on my device, or explained why that is not possible

Final pass

  • Code is as readable as possible.
  • I have reviewed this PR myself, ensuring it meets each checklist item

Copilot AI review requested due to automatic review settings April 21, 2026 17:28
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Ports the PII-at-rest hashing feature to main by ensuring email and SMS subscription addresses are SHA-256 hashed at the SharedPreferences persistence boundary, while keeping in-memory values unchanged and maintaining pre-/post-hydration behavior.

Changes:

  • Added PIIHasher utility (SHA-256 + “looks hashed” detection) and unit tests.
  • Added ModelStore.transformJsonForPersistence(...) hook and implemented hashing in SubscriptionModelStore for EMAIL/SMS (excluding PUSH).
  • Updated subscription removal, lookup, logging, and public getters to behave correctly when persisted addresses are hashed (cold-start window).

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
OneSignalSDK/onesignal/core/src/main/java/com/onesignal/common/PIIHasher.kt New SHA-256 hashing utility + hash detection.
OneSignalSDK/onesignal/core/src/main/java/com/onesignal/common/modeling/ModelStore.kt Adds persistence-time JSON transform hook used for hashing at rest.
OneSignalSDK/onesignal/core/src/main/java/com/onesignal/user/internal/subscriptions/SubscriptionModelStore.kt Overrides persistence hook to hash EMAIL/SMS address before writing to prefs.
OneSignalSDK/onesignal/core/src/main/java/com/onesignal/user/internal/subscriptions/impl/SubscriptionManager.kt Hash-aware removal + redacted logging for non-push addresses.
OneSignalSDK/onesignal/core/src/main/java/com/onesignal/user/internal/subscriptions/SubscriptionList.kt Hash-aware lookup for email/SMS subscriptions.
OneSignalSDK/onesignal/core/src/main/java/com/onesignal/user/internal/EmailSubscription.kt Returns empty string from getter when address is hashed (cold-start window).
OneSignalSDK/onesignal/core/src/main/java/com/onesignal/user/internal/SmsSubscription.kt Returns empty string from getter when address is hashed (cold-start window).
OneSignalSDK/onesignal/core/src/test/java/com/onesignal/common/PIIHasherTests.kt Unit tests for hashing + isHashed detection.
OneSignalSDK/onesignal/core/src/test/java/com/onesignal/user/internal/subscriptions/SubscriptionModelStoreTests.kt Verifies persistence hashes EMAIL/SMS only and is idempotent.
OneSignalSDK/onesignal/core/src/test/java/com/onesignal/user/internal/subscriptions/SubscriptionManagerTests.kt Verifies hash-aware removal/lookups and getter behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +39 to +43
val hashed = PIIHasher.hash(email)
return emails.firstOrNull {
val address = (it as Subscription).model.address
address == email || address == hashed
}
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getByEmail performs an unsafe cast (it as Subscription) on an IEmailSubscription. If SubscriptionList ever contains a different IEmailSubscription implementation, this will throw a ClassCastException at runtime. Consider using a safe cast (as?) and only reading model.address when the cast succeeds, or narrowing the emails list to the expected internal Subscription-backed types before comparing.

Copilot uses AI. Check for mistakes.
Comment on lines +52 to +56
val hashed = PIIHasher.hash(sms)
return smss.firstOrNull {
val address = (it as Subscription).model.address
address == sms || address == hashed
}
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getBySMS performs an unsafe cast (it as Subscription) on an ISmsSubscription, which can crash with ClassCastException if the list ever contains another ISmsSubscription implementation. Use a safe cast (as?) or filter to Subscription-backed instances before reading model.address.

Copilot uses AI. Check for mistakes.
Comment on lines 10 to +14
override val email: String
get() = model.address
get() {
val address = model.address
return if (PIIHasher.isHashed(address)) "" else address
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 After a cold restart, IEmailSubscription.email (and ISmsSubscription.number) returns "" instead of the user's address until server hydration completes, because addresses are now loaded from SharedPreferences in hashed form and EmailSubscription.email returns "" whenever PIIHasher.isHashed(address) is true. For offline-first apps or any scenario where network is unavailable at startup, hydration may never occur, leaving these properties permanently empty — and the PR description's claim of "No public API changes" is inaccurate. Callers should use subscriptions.emails.count() > 0 to check for subscription existence, and the IEmailSubscription.email / ISmsSubscription.number API docs should be updated to document the empty-string semantics.

Extended reasoning...

What the bug is and how it manifests

This PR hashes email and SMS addresses before writing them to SharedPreferences (in SubscriptionModelStore.transformJsonForPersistence). After a cold restart, ModelStore.load() reads the persisted JSON back into memory — but since only the hash was stored, SubscriptionModel.address is initialized with the SHA-256 hex string. EmailSubscription.email (EmailSubscription.kt:13) then checks PIIHasher.isHashed(address) and returns "" if true. The result: the public property IEmailSubscription.email returns an empty string from process start until server hydration replaces the model with the plaintext address.

The specific code path that triggers it

  1. App calls addEmailSubscription("user@example.com") — model stored in memory with raw address.
  2. SubscriptionModelStore.transformJsonForPersistence hashes it: SharedPreferences now holds {"address": ""}.
  3. Cold restart: ModelStore.load() reads persisted JSON and sets SubscriptionModel.address = "".
  4. EmailSubscription.email (line 13): PIIHasher.isHashed("") returns true, so returns "".
  5. RefreshUserOperationExecutor eventually calls subscriptionsModelStore.replaceAll(..., HYDRATE), restoring the plaintext address. Until this completes, email is "".

Why existing code does not prevent it

The hashing is intentionally done at the serialization boundary so the in-memory model retains the plaintext value during the session. However, after a cold restart the in-memory model is rebuilt exclusively from the persisted (hashed) JSON — there is no mechanism to keep or recover the plaintext before server hydration.

What the impact would be

Any developer code that reads email.email or sms.number before hydration completes will receive an empty string: account-management UIs will show blank addresses; existence checks like if (email.email != "") will silently skip valid subscriptions; analytics or logging that captures the address will capture nothing. For offline-first apps — or apps whose network calls fail at startup — hydration never runs, leaving these properties permanently empty for the session. The PR checklist marks "No public API changes" as satisfied, but this IS a behavioral change to the IEmailSubscription and ISmsSubscription public contracts.

Addressing the refutation

The refuter correctly notes this is intentional design (SHA-256 is irreversible, so returning "" is the only reasonable choice), that the hydration window is typically short for online apps, and that checking email.email != "" for existence is a misuse pattern. All of this is true. However, the question is not whether the design is sound internally — it is whether the public API contract has changed without documentation. The IEmailSubscription.email KDoc says "The email address notifications will be sent to for this subscription", with no mention of returning "" for a hashed/pending state. The PR own unit test ("email getter returns empty string when address is hashed") proves the behavior is real and intentional, which makes the absence of public API documentation even more notable. Intentional design choices that change observable public API behavior still need to be documented.

How to fix it

The code logic itself may be acceptable as-is. The required fixes are: (1) update the IEmailSubscription.email and ISmsSubscription.number KDoc to state that the property returns "" when the address is pending server hydration after a cold restart; (2) document the recommended alternative for checking subscription existence: subscriptions.emails.count() > 0 or subscriptions.getByEmail(rawEmail); (3) update the PR checklist to accurately reflect that this is a public API behavioral change.

Step-by-step proof

  1. First run: OneSignal.User.addEmail("alice@example.com") sets SubscriptionModel.address = "alice@example.com" in memory.
  2. persist() is called: transformJsonForPersistence hashes it, SharedPreferences stores {"type":"EMAIL","address":"", ...}.
  3. User force-closes app (cold restart). Network is unavailable (airplane mode).
  4. ModelStore.load() reads SharedPreferences and sets SubscriptionModel.address = "".
  5. Developer calls OneSignal.User.subscriptions.emails[0].email: EmailSubscription.email getter runs PIIHasher.isHashed("") which returns true, so the getter returns "".
  6. subscriptions.emails.count() returns 1 (subscription exists), but subscriptions.emails[0].email returns "" — two APIs disagree on whether the subscription has an address.
  7. Because the device is offline, RefreshUserOperationExecutor never runs, so email remains "" for the entire session.

Comment on lines 222 to 234

/**
* Hook for subclasses to transform a model's JSON representation before it is
* written to SharedPreferences. The default implementation returns the JSON unchanged.
*/
protected open fun transformJsonForPersistence(
model: TModel,
json: JSONObject,
): JSONObject = json

override fun subscribe(handler: IModelStoreChangeHandler<TModel>) = changeSubscription.subscribe(handler)

override fun unsubscribe(handler: IModelStoreChangeHandler<TModel>) = changeSubscription.unsubscribe(handler)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 This PR hashes email/SMS addresses in SubscriptionModelStore but leaves OperationModelStore unaddressed: pending CreateSubscriptionOperation and UpdateSubscriptionOperation objects contain a raw address field populated from model.address (plaintext email/phone), and since OperationModelStore does not override transformJsonForPersistence, these are serialized to SharedPreferences under MODEL_STORE_PREFIX + "operations" with no hashing. On a rooted device or via ADB backup the plaintext PII is still fully exposed, leaving the stated security goal only half-achieved. Note: fixing this is non-trivial because operations must retain the raw address to execute the API call (SHA-256 is irreversible).

Extended reasoning...

What the bug is and how it manifests

The PR introduces a transformJsonForPersistence hook in ModelStore and overrides it in SubscriptionModelStore to SHA-256-hash email and SMS addresses before writing to SharedPreferences. However, a second persistence path exists: OperationModelStore, which queues pending API operations such as CreateSubscriptionOperation and UpdateSubscriptionOperation. OperationModelStore extends ModelStore<Operation> but does not override transformJsonForPersistence, so the base no-op implementation is used and the raw address field in those operation objects is written to SharedPreferences unchanged.

The specific code path that triggers it

  1. A subscription is added or updated; SubscriptionModelStoreListener.getAddOperation() (line 29) and getUpdateOperation() (line 52) each pass model.address — the raw, unhashed email or phone number — to CreateSubscriptionOperation / UpdateSubscriptionOperation.
  2. These operations are enqueued in OperationModelStore (line 32 of OperationModelStore.kt), whose persist() call iterates all pending operations and calls transformJsonForPersistence(model, model.toJSON()). Because there is no override, the JSON is returned unchanged.
  3. The resulting JSON array — containing plaintext email/SMS in the address field — is saved to SharedPreferences under MODEL_STORE_PREFIX + "operations".

Why existing code does not prevent it

The PR's own test "persist keeps in-memory model address as raw value" explicitly confirms that in-memory SubscriptionModel.address stays raw after add(). The hashing is a JSON-output-only transformation applied only by SubscriptionModelStore.transformJsonForPersistence. OperationModelStore has no such override, so it serialises whatever Operation.toJSON() produces — which includes the raw address string.

Impact

Any device that is rooted, has ADB backup enabled, or has been analysed with a forensic tool would expose plaintext email addresses and phone numbers stored in the "operations" SharedPreferences key. This directly contradicts the PR's stated goal of keeping PII out of SharedPreferences in plain text. The window of exposure is any time an operation is pending — between when the user adds/updates a subscription and when the network call completes and the operation is dequeued.

Concrete step-by-step proof

  1. Call OneSignal.User.addEmail("victim@example.com").
  2. SubscriptionManager.addSubscriptionToModels(EMAIL, "victim@example.com") creates a SubscriptionModel with address = "victim@example.com" and calls _subscriptionModelStore.add(model).
  3. SubscriptionModelStoreListener.onModelAdded fires, calling getAddOperation(model) (line 29 of SubscriptionModelStoreListener.kt) which passes model.address (= "victim@example.com") into CreateSubscriptionOperation.
  4. The operation is added to OperationModelStore; persist() runs using the no-op base transformJsonForPersistence.
  5. SharedPreferences now contains {..."address":"victim@example.com"...} under MODEL_STORE_PREFIX + "operations".
  6. adb backup or a /data/data/ read on a rooted device exposes the plaintext email.

How to fix it

Fixing this is non-trivial because SubscriptionOperationExecutor reads the address field from the persisted JSON to construct the API request body (the value must be the actual email/phone number, not its hash). A possible approach: store a hashed surrogate in the persisted operation JSON but keep the raw address in a separate in-memory-only field, rehydrating it from the live SubscriptionModel when the operation is executed — or at minimum document this as a known gap.

@sherwinski sherwinski merged commit 5600289 into main Apr 21, 2026
9 of 10 checks passed
@sherwinski sherwinski deleted the sherwin/sdk-4202-pii-hashing-main branch April 21, 2026 18:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants