feat: hash PII (email/SMS) in SharedPreferences at rest#2620
Conversation
There was a problem hiding this comment.
Pull request overview
Ports the PII-at-rest hashing feature to main by ensuring email and SMS subscription addresses are SHA-256 hashed at the SharedPreferences persistence boundary, while keeping in-memory values unchanged and maintaining pre-/post-hydration behavior.
Changes:
- Added
PIIHasherutility (SHA-256 + “looks hashed” detection) and unit tests. - Added
ModelStore.transformJsonForPersistence(...)hook and implemented hashing inSubscriptionModelStorefor EMAIL/SMS (excluding PUSH). - Updated subscription removal, lookup, logging, and public getters to behave correctly when persisted addresses are hashed (cold-start window).
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| OneSignalSDK/onesignal/core/src/main/java/com/onesignal/common/PIIHasher.kt | New SHA-256 hashing utility + hash detection. |
| OneSignalSDK/onesignal/core/src/main/java/com/onesignal/common/modeling/ModelStore.kt | Adds persistence-time JSON transform hook used for hashing at rest. |
| OneSignalSDK/onesignal/core/src/main/java/com/onesignal/user/internal/subscriptions/SubscriptionModelStore.kt | Overrides persistence hook to hash EMAIL/SMS address before writing to prefs. |
| OneSignalSDK/onesignal/core/src/main/java/com/onesignal/user/internal/subscriptions/impl/SubscriptionManager.kt | Hash-aware removal + redacted logging for non-push addresses. |
| OneSignalSDK/onesignal/core/src/main/java/com/onesignal/user/internal/subscriptions/SubscriptionList.kt | Hash-aware lookup for email/SMS subscriptions. |
| OneSignalSDK/onesignal/core/src/main/java/com/onesignal/user/internal/EmailSubscription.kt | Returns empty string from getter when address is hashed (cold-start window). |
| OneSignalSDK/onesignal/core/src/main/java/com/onesignal/user/internal/SmsSubscription.kt | Returns empty string from getter when address is hashed (cold-start window). |
| OneSignalSDK/onesignal/core/src/test/java/com/onesignal/common/PIIHasherTests.kt | Unit tests for hashing + isHashed detection. |
| OneSignalSDK/onesignal/core/src/test/java/com/onesignal/user/internal/subscriptions/SubscriptionModelStoreTests.kt | Verifies persistence hashes EMAIL/SMS only and is idempotent. |
| OneSignalSDK/onesignal/core/src/test/java/com/onesignal/user/internal/subscriptions/SubscriptionManagerTests.kt | Verifies hash-aware removal/lookups and getter behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| val hashed = PIIHasher.hash(email) | ||
| return emails.firstOrNull { | ||
| val address = (it as Subscription).model.address | ||
| address == email || address == hashed | ||
| } |
There was a problem hiding this comment.
getByEmail performs an unsafe cast (it as Subscription) on an IEmailSubscription. If SubscriptionList ever contains a different IEmailSubscription implementation, this will throw a ClassCastException at runtime. Consider using a safe cast (as?) and only reading model.address when the cast succeeds, or narrowing the emails list to the expected internal Subscription-backed types before comparing.
| val hashed = PIIHasher.hash(sms) | ||
| return smss.firstOrNull { | ||
| val address = (it as Subscription).model.address | ||
| address == sms || address == hashed | ||
| } |
There was a problem hiding this comment.
getBySMS performs an unsafe cast (it as Subscription) on an ISmsSubscription, which can crash with ClassCastException if the list ever contains another ISmsSubscription implementation. Use a safe cast (as?) or filter to Subscription-backed instances before reading model.address.
| override val email: String | ||
| get() = model.address | ||
| get() { | ||
| val address = model.address | ||
| return if (PIIHasher.isHashed(address)) "" else address | ||
| } |
There was a problem hiding this comment.
🔴 After a cold restart, IEmailSubscription.email (and ISmsSubscription.number) returns "" instead of the user's address until server hydration completes, because addresses are now loaded from SharedPreferences in hashed form and EmailSubscription.email returns "" whenever PIIHasher.isHashed(address) is true. For offline-first apps or any scenario where network is unavailable at startup, hydration may never occur, leaving these properties permanently empty — and the PR description's claim of "No public API changes" is inaccurate. Callers should use subscriptions.emails.count() > 0 to check for subscription existence, and the IEmailSubscription.email / ISmsSubscription.number API docs should be updated to document the empty-string semantics.
Extended reasoning...
What the bug is and how it manifests
This PR hashes email and SMS addresses before writing them to SharedPreferences (in SubscriptionModelStore.transformJsonForPersistence). After a cold restart, ModelStore.load() reads the persisted JSON back into memory — but since only the hash was stored, SubscriptionModel.address is initialized with the SHA-256 hex string. EmailSubscription.email (EmailSubscription.kt:13) then checks PIIHasher.isHashed(address) and returns "" if true. The result: the public property IEmailSubscription.email returns an empty string from process start until server hydration replaces the model with the plaintext address.
The specific code path that triggers it
- App calls addEmailSubscription("user@example.com") — model stored in memory with raw address.
- SubscriptionModelStore.transformJsonForPersistence hashes it: SharedPreferences now holds {"address": ""}.
- Cold restart: ModelStore.load() reads persisted JSON and sets SubscriptionModel.address = "".
- EmailSubscription.email (line 13): PIIHasher.isHashed("") returns true, so returns "".
- RefreshUserOperationExecutor eventually calls subscriptionsModelStore.replaceAll(..., HYDRATE), restoring the plaintext address. Until this completes, email is "".
Why existing code does not prevent it
The hashing is intentionally done at the serialization boundary so the in-memory model retains the plaintext value during the session. However, after a cold restart the in-memory model is rebuilt exclusively from the persisted (hashed) JSON — there is no mechanism to keep or recover the plaintext before server hydration.
What the impact would be
Any developer code that reads email.email or sms.number before hydration completes will receive an empty string: account-management UIs will show blank addresses; existence checks like if (email.email != "") will silently skip valid subscriptions; analytics or logging that captures the address will capture nothing. For offline-first apps — or apps whose network calls fail at startup — hydration never runs, leaving these properties permanently empty for the session. The PR checklist marks "No public API changes" as satisfied, but this IS a behavioral change to the IEmailSubscription and ISmsSubscription public contracts.
Addressing the refutation
The refuter correctly notes this is intentional design (SHA-256 is irreversible, so returning "" is the only reasonable choice), that the hydration window is typically short for online apps, and that checking email.email != "" for existence is a misuse pattern. All of this is true. However, the question is not whether the design is sound internally — it is whether the public API contract has changed without documentation. The IEmailSubscription.email KDoc says "The email address notifications will be sent to for this subscription", with no mention of returning "" for a hashed/pending state. The PR own unit test ("email getter returns empty string when address is hashed") proves the behavior is real and intentional, which makes the absence of public API documentation even more notable. Intentional design choices that change observable public API behavior still need to be documented.
How to fix it
The code logic itself may be acceptable as-is. The required fixes are: (1) update the IEmailSubscription.email and ISmsSubscription.number KDoc to state that the property returns "" when the address is pending server hydration after a cold restart; (2) document the recommended alternative for checking subscription existence: subscriptions.emails.count() > 0 or subscriptions.getByEmail(rawEmail); (3) update the PR checklist to accurately reflect that this is a public API behavioral change.
Step-by-step proof
- First run: OneSignal.User.addEmail("alice@example.com") sets SubscriptionModel.address = "alice@example.com" in memory.
- persist() is called: transformJsonForPersistence hashes it, SharedPreferences stores {"type":"EMAIL","address":"", ...}.
- User force-closes app (cold restart). Network is unavailable (airplane mode).
- ModelStore.load() reads SharedPreferences and sets SubscriptionModel.address = "".
- Developer calls OneSignal.User.subscriptions.emails[0].email: EmailSubscription.email getter runs PIIHasher.isHashed("") which returns true, so the getter returns "".
- subscriptions.emails.count() returns 1 (subscription exists), but subscriptions.emails[0].email returns "" — two APIs disagree on whether the subscription has an address.
- Because the device is offline, RefreshUserOperationExecutor never runs, so email remains "" for the entire session.
|
|
||
| /** | ||
| * Hook for subclasses to transform a model's JSON representation before it is | ||
| * written to SharedPreferences. The default implementation returns the JSON unchanged. | ||
| */ | ||
| protected open fun transformJsonForPersistence( | ||
| model: TModel, | ||
| json: JSONObject, | ||
| ): JSONObject = json | ||
|
|
||
| override fun subscribe(handler: IModelStoreChangeHandler<TModel>) = changeSubscription.subscribe(handler) | ||
|
|
||
| override fun unsubscribe(handler: IModelStoreChangeHandler<TModel>) = changeSubscription.unsubscribe(handler) |
There was a problem hiding this comment.
🔴 This PR hashes email/SMS addresses in SubscriptionModelStore but leaves OperationModelStore unaddressed: pending CreateSubscriptionOperation and UpdateSubscriptionOperation objects contain a raw address field populated from model.address (plaintext email/phone), and since OperationModelStore does not override transformJsonForPersistence, these are serialized to SharedPreferences under MODEL_STORE_PREFIX + "operations" with no hashing. On a rooted device or via ADB backup the plaintext PII is still fully exposed, leaving the stated security goal only half-achieved. Note: fixing this is non-trivial because operations must retain the raw address to execute the API call (SHA-256 is irreversible).
Extended reasoning...
What the bug is and how it manifests
The PR introduces a transformJsonForPersistence hook in ModelStore and overrides it in SubscriptionModelStore to SHA-256-hash email and SMS addresses before writing to SharedPreferences. However, a second persistence path exists: OperationModelStore, which queues pending API operations such as CreateSubscriptionOperation and UpdateSubscriptionOperation. OperationModelStore extends ModelStore<Operation> but does not override transformJsonForPersistence, so the base no-op implementation is used and the raw address field in those operation objects is written to SharedPreferences unchanged.
The specific code path that triggers it
- A subscription is added or updated;
SubscriptionModelStoreListener.getAddOperation()(line 29) andgetUpdateOperation()(line 52) each passmodel.address— the raw, unhashed email or phone number — toCreateSubscriptionOperation/UpdateSubscriptionOperation. - These operations are enqueued in
OperationModelStore(line 32 ofOperationModelStore.kt), whosepersist()call iterates all pending operations and callstransformJsonForPersistence(model, model.toJSON()). Because there is no override, the JSON is returned unchanged. - The resulting JSON array — containing plaintext email/SMS in the
addressfield — is saved to SharedPreferences underMODEL_STORE_PREFIX + "operations".
Why existing code does not prevent it
The PR's own test "persist keeps in-memory model address as raw value" explicitly confirms that in-memory SubscriptionModel.address stays raw after add(). The hashing is a JSON-output-only transformation applied only by SubscriptionModelStore.transformJsonForPersistence. OperationModelStore has no such override, so it serialises whatever Operation.toJSON() produces — which includes the raw address string.
Impact
Any device that is rooted, has ADB backup enabled, or has been analysed with a forensic tool would expose plaintext email addresses and phone numbers stored in the "operations" SharedPreferences key. This directly contradicts the PR's stated goal of keeping PII out of SharedPreferences in plain text. The window of exposure is any time an operation is pending — between when the user adds/updates a subscription and when the network call completes and the operation is dequeued.
Concrete step-by-step proof
- Call
OneSignal.User.addEmail("victim@example.com"). SubscriptionManager.addSubscriptionToModels(EMAIL, "victim@example.com")creates aSubscriptionModelwithaddress = "victim@example.com"and calls_subscriptionModelStore.add(model).SubscriptionModelStoreListener.onModelAddedfires, callinggetAddOperation(model)(line 29 ofSubscriptionModelStoreListener.kt) which passesmodel.address(="victim@example.com") intoCreateSubscriptionOperation.- The operation is added to
OperationModelStore;persist()runs using the no-op basetransformJsonForPersistence. - SharedPreferences now contains
{..."address":"victim@example.com"...}underMODEL_STORE_PREFIX + "operations". adb backupor a/data/data/read on a rooted device exposes the plaintext email.
How to fix it
Fixing this is non-trivial because SubscriptionOperationExecutor reads the address field from the persisted JSON to construct the API request body (the value must be the actual email/phone number, not its hash). A possible approach: store a hashed surrogate in the persisted operation JSON but keep the raw address in a separate in-memory-only field, rehydrating it from the live SubscriptionModel when the operation is executed — or at minimum document this as a known gap.
Description
One Line Summary
Cherry-pick of #2614 from
5.7-maintomain.Details
Motivation
Port the PII hashing feature to
mainso it ships in the next major release as well. See #2614 for full context, discussion, and manual testing details.Scope
Identical to #2614 — hashes email and SMS subscription addresses (SHA-256) at the serialization boundary before writing to SharedPreferences. Push tokens are unchanged. No public API changes.
Testing
Unit testing
Includes all unit tests from #2614:
PIIHasherTests,SubscriptionModelStoreTests, and hash-aware tests inSubscriptionManagerTests.Manual testing
Covered in #2614 — same code, clean cherry-pick with no conflicts.
Affected code checklist
Checklist
Overview
Testing
Final pass