WIP: missing capabilities lead to NoSupportedApiException #6317
Draft
mahibi wants to merge 1 commit into
The crash is:
Exception com.nextcloud.talk.utils.NoSupportedApiException:
at com.nextcloud.talk.utils.ApiUtils.getConversationApiVersion (ApiUtils.kt:127)
at com.nextcloud.talk.chat.data.network.RetrofitChatNetwork.getRoom (RetrofitChatNetwork.kt:31)
at com.nextcloud.talk.conversationlist.data.network.OfflineFirstConversationsRepository$getRoom$1.invokeSuspend (OfflineFirstConversationsRepository.kt:69)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith (ContinuationImpl.kt:34)
at kotlinx.coroutines.DispatchedTask.run (DispatchedTask.kt:100)
at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run (LimitedDispatcher.kt:124)
at kotlinx.coroutines.scheduling.TaskImpl.run (Tasks.kt:89)
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely (CoroutineScheduler.kt:586)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask (CoroutineScheduler.kt:820)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker (CoroutineScheduler.kt:717)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run (CoroutineScheduler.kt:704)
Claude suggests using defer to avoid the crash, but that will just hide the root cause.
Some snippets from the claude chat:
● The root cause: getConversationApiVersion throws NoSupportedApiException synchronously during Observable construction (before .subscribe() is called). This exception bypassed the onError handler entirely and crashed the coroutine as an unhandled exception.
The fix wraps all three affected methods (getRoom, getCapabilities, joinRoom) in Observable.defer {}, which defers the Observable construction to subscription time, so any NoSupportedApiException now flows through onError and is handled gracefully (falling back to the local database in OfflineFirstConversationsRepository.getRoom).
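The difference between eager and deferred construction can be illustrated without any library. The following is only a sketch: `Source`, `defer`, `just`, `apiVersion`, `getRoomEager` and `getRoomDeferred` are hypothetical stand-ins written for this example, not the RxJava types or the talk-android methods. The model assumes, as RxJava's `Observable.defer` does, that an exception thrown by the factory is delivered to the subscriber's error callback.

```kotlin
// Library-free model of eager vs. deferred construction.
// All names here are hypothetical stand-ins, not RxJava or talk-android APIs.
class NoSupportedApiException : RuntimeException("no supported API version")

interface Source<T> {
    fun subscribe(onNext: (T) -> Unit, onError: (Throwable) -> Unit)
}

// Runs `factory` only on subscription and routes anything it throws to onError.
fun <T> defer(factory: () -> Source<T>): Source<T> = object : Source<T> {
    override fun subscribe(onNext: (T) -> Unit, onError: (Throwable) -> Unit) {
        val inner = try {
            factory()
        } catch (t: Throwable) {
            onError(t)
            return
        }
        inner.subscribe(onNext, onError)
    }
}

// Emits a single value to the subscriber.
fun just(value: String): Source<String> = object : Source<String> {
    override fun subscribe(onNext: (String) -> Unit, onError: (Throwable) -> Unit) {
        onNext(value)
    }
}

// Hypothetical version lookup: throws when the feature list is missing.
fun apiVersion(features: List<String>?): Int =
    if (features != null && "conversation-v4" in features) 4
    else throw NoSupportedApiException()

// Eager: the lookup runs while the Source is built, so it throws at the call site.
fun getRoomEager(features: List<String>?): Source<String> {
    val version = apiVersion(features)
    return just("room via v$version")
}

// Deferred: the lookup runs at subscription time, so a failure reaches onError.
fun getRoomDeferred(features: List<String>?): Source<String> =
    defer { getRoomEager(features) }

fun main() {
    val eagerThrew = try {
        getRoomEager(null)
        false
    } catch (e: NoSupportedApiException) {
        true
    }
    println("eager construction threw: $eagerThrew")

    getRoomDeferred(null).subscribe(
        { room -> println("onNext: $room") },
        { e -> println("onError: ${e::class.simpleName}") }
    )
}
```

In this model the deferred variant never throws at the call site. The failure only becomes visible if the subscriber supplies an error callback that does something with it, which is why the approach can mask the underlying problem.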
❯ why does getConversationApiVersion crash at all?
● The exception throws when none of the checked capability strings match anything in the user's spreed features list. That happens in two real scenarios:
1. capabilities is null — hasSpreedFeatureCapability silently returns false for every version checked, so the loop exhausts all candidates and hits the throw.
2. Stale/empty capabilities — the user object was loaded from the database before capabilities were fetched or refreshed, so spreedCapability?.features is empty or missing the expected "conversation-v4" / "conversation-v3" / "conversation-v2" / "mention-flag" entries.
In both cases the loop just falls through every version without matching, and throw NoSupportedApiException() is the only exit path. The capabilities not being present is the underlying trigger — the exception itself is the symptom.
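A simplified model of that lookup shows why a user without capabilities always ends at the throw. This is a sketch under assumptions: the data classes are hypothetical stand-ins for the app's types, the function signatures are guessed from the description above, and the legacy "mention-flag" check is left out for brevity.

```kotlin
// Simplified model of the version lookup described above.
// The data classes are hypothetical stand-ins, not the app's real types.
class NoSupportedApiException : RuntimeException()

data class SpreedCapability(val features: List<String>?)
data class Capabilities(val spreedCapability: SpreedCapability?)
data class User(val capabilities: Capabilities?)

// Null-safe lookup: any missing link in the chain yields false instead of an error.
fun hasSpreedFeatureCapability(user: User, feature: String): Boolean =
    user.capabilities?.spreedCapability?.features?.contains(feature) == true

// Returns the first candidate version the server advertises, or throws if none match.
fun getConversationApiVersion(user: User, versions: IntArray): Int {
    for (version in versions) {
        if (hasSpreedFeatureCapability(user, "conversation-v$version")) return version
    }
    throw NoSupportedApiException()
}

fun main() {
    val candidates = intArrayOf(4, 3, 1)

    val ready = User(Capabilities(SpreedCapability(listOf("conversation-v4"))))
    println("version for loaded user: ${getConversationApiVersion(ready, candidates)}")

    // Capabilities were never fetched: every check returns false, so the loop falls through.
    val notLoaded = User(capabilities = null)
    try {
        getConversationApiVersion(notLoaded, candidates)
    } catch (e: NoSupportedApiException) {
        println("no supported API for user without capabilities")
    }
}
```

Because the lookup is null-safe, "capabilities not loaded yet" and "server supports none of these versions" are indistinguishable at the throw site in this model.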
❯ look why the capabilities could be null
● The stack trace came from OfflineFirstConversationsRepository, let me check NotificationWorker too since it directly calls chatNetworkDataSource.getRoom. Let me look at how the user object is obtained in the key callers.
Here's the root cause chain:
Capabilities come from CapabilitiesWorker — a background WorkManager job that fetches capabilities from the server and writes them to the database. They are null in the database by default and only populated after this worker succeeds.
Three scenarios where capabilities end up null when getRoom is called:
1. Race on first install / account add: CapabilitiesWorker is queued after account verification (AccountVerificationActivity.kt:473), but if a push notification arrives before the worker finishes (or on the next app cold start before it runs), NotificationWorker loads the user via getUserWithId and gets a User with capabilities = null straight from the database.
2. CapabilitiesWorker failure: it retries 3 times (line 129), but on network failure it just posts an EventStatus failure event and exits without persisting anything. Capabilities stay null or stale forever until the next successful run.
3. App startup race in ChatViewModel: getRoom at line 1418 uses currentUser which is set at initialization. If getRoom is triggered before the CapabilitiesWorker result is persisted to the database and re-read into the in-memory User object, the user carried in memory still has capabilities = null.
The Observable.defer fix you now have handles all three cases correctly — the NoSupportedApiException flows to onError instead of crashing the coroutine, allowing graceful fallback to the local database.
❯ the root cause will just be hidden when i do this..?
● Yes, you're right. The Observable.defer fix stops the crash and falls back to local DB, but the API call still silently fails — the user with null capabilities never gets a refreshed conversation from the server.
The actual root cause is that getRoom is being called with a user whose capabilities haven't been loaded yet. The real fix depends on what's causing that:
- If it's a race (CapabilitiesWorker hasn't finished yet), the callers should wait for capabilities before calling getRoom, or trigger a capabilities fetch first.
- If it's the NotificationWorker path specifically, it could fetch capabilities inline before calling getRoom.
- If it's a persistent null (CapabilitiesWorker keeps failing), that's a separate bug that needs its own fix.
Before deciding the fix direction — do you know in which caller context this crash actually occurred? The stack trace points to OfflineFirstConversationsRepository, which is called from ChatViewModel.getRoom and NotificationWorker. Knowing which one triggered it would narrow down the right fix.
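One possible shape for the "fetch capabilities before calling getRoom" direction is sketched below. It is not taken from the talk-android code base: `CapabilitiesSource`, `Prepared` and `ensureCapabilities` are hypothetical names, and the real code would likely be asynchronous rather than a blocking call.

```kotlin
// Hypothetical sketch: make sure a user carries capabilities before any API call.
// None of these names come from the talk-android code base.
data class Capabilities(val features: List<String>)
data class User(val id: Long, val capabilities: Capabilities?)

interface CapabilitiesSource {
    // Returns the server capabilities, or null if the fetch failed.
    fun fetch(user: User): Capabilities?
}

sealed interface Prepared {
    data class Ready(val user: User) : Prepared
    object CapabilitiesUnavailable : Prepared
}

// Returns a user that carries capabilities, fetching them inline when they are missing.
// A failed fetch is reported explicitly instead of surfacing later as an exception.
fun ensureCapabilities(user: User, source: CapabilitiesSource): Prepared {
    if (user.capabilities != null) return Prepared.Ready(user)
    val fetched = source.fetch(user) ?: return Prepared.CapabilitiesUnavailable
    return Prepared.Ready(user.copy(capabilities = fetched))
}

fun main() {
    val server = object : CapabilitiesSource {
        override fun fetch(user: User): Capabilities? =
            Capabilities(listOf("conversation-v4"))
    }
    when (val result = ensureCapabilities(User(1, null), server)) {
        is Prepared.Ready -> println("call getRoom with ${result.user.capabilities}")
        Prepared.CapabilitiesUnavailable -> println("report the failure instead of hiding it")
    }
}
```

The point of the explicit result type is that the "capabilities missing" case becomes a visible branch at the call site, rather than an exception that is either fatal or silently swallowed.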
Signed-off-by: Marcel Hibbe <dev@mhibbe.de>
APK file: https://github.com/nextcloud/talk-android/actions/runs/26969118712/artifacts/7418983494
🏁 Checklist
/backport to stable-xx.x