fix: infinite sync after MLS got enabled while logged out - WPB-24543 #4532
Open
David-Henner wants to merge 5 commits into release/cycle-4.16
Contributor
Test Results 7 files 982 suites 11m 6s ⏱️ Results for commit c393eca. Summary: workflow run #23907886957 |
netbe
approved these changes
Apr 3, 2026
Collaborator
netbe
left a comment
LGTM:)
I just wonder if this is for 4.16.1 or 4.17.0?



Issue
After logging out and back in on a team where MLS was enabled while the user was logged out, the incremental sync enters an infinite sync loop because of an error:
The sync eventually stops thanks to the BackoffRetrier, and an alert is shown with the error above. Afterwards we also observe a few EAR decryption errors.
Causes
A: Zombie websocket stores events with stale EAR keys.
ZMUserSession.tearDown() set syncAgent = nil without calling suspend() first. This left the websocket alive after logout. When the user logged back in, a live user.clientAdd event arrived on this zombie connection and was encrypted into the event store using EAR public keys that had been fetched at the start of the previous session's IncrementalSync.
B: EAR keys were not deleted on logout.
Account.deleteKeychainItems() removed the app lock passcode and CoreCrypto keys but not EAR keys. On re-login, clientRegistrationDidSucceed called enableEncryptionAtRest, which deleted the old EAR keys and generated new ones. Any events already stored with the old public key became permanently undecryptable.
In short: the zombie websocket stores a freshly arrived event using old EAR keys, then key rotation destroys those keys, leaving the stored event orphaned.
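The sequence described under Causes can be illustrated with a toy model. This is only a sketch under loud assumptions: every type here (ToyKeyPair, ToyKeychain, ToyEventStore) is hypothetical, there is no real cryptography, and "encryption" merely records which key pair was used. It is not the code from this PR.

```swift
import Foundation

// Toy stand-in for an EAR key pair (hypothetical). A pair is identified only by a random ID.
struct ToyKeyPair {
    let id = UUID()
}

// A stored event remembers which key pair it was "encrypted" with.
struct ToyStoredEvent {
    let payload: String
    let keyID: UUID
}

// Holds only the current key pair; rotating discards the previous one.
final class ToyKeychain {
    private(set) var current: ToyKeyPair?

    @discardableResult
    func rotateKeys() -> ToyKeyPair {
        let fresh = ToyKeyPair()
        current = fresh
        return fresh
    }
}

final class ToyEventStore {
    private var events: [ToyStoredEvent] = []

    func store(_ payload: String, encryptedWith key: ToyKeyPair) {
        events.append(ToyStoredEvent(payload: payload, keyID: key.id))
    }

    // An event is readable only if its key pair is still the one in the keychain.
    func decryptAll(using keychain: ToyKeychain) -> (decrypted: [String], undecryptable: Int) {
        var decrypted: [String] = []
        var undecryptable = 0
        for event in events {
            if event.keyID == keychain.current?.id {
                decrypted.append(event.payload)
            } else {
                undecryptable += 1
            }
        }
        return (decrypted, undecryptable)
    }
}

func simulateStaleKeyScenario() -> (decrypted: [String], undecryptable: Int) {
    let keychain = ToyKeychain()
    let store = ToyEventStore()

    // Previous session: the sync captures the public key it fetched.
    let capturedKey = keychain.rotateKeys()

    // Logout leaves the websocket alive; after re-login an event arrives on it
    // and is stored with the captured (old) key.
    store.store("user.clientAdd", encryptedWith: capturedKey)

    // Re-login rotates the keys, discarding the old pair.
    let freshKey = keychain.rotateKeys()
    store.store("conversation.message", encryptedWith: freshKey)

    return store.decryptAll(using: keychain)
}
```

Calling simulateStaleKeyScenario() in this model leaves the first event undecryptable while the event stored after rotation remains readable, which mirrors the orphaned-event outcome described above.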
Solutions
- Call suspend() on the sync agent before teardown, ensuring the push channel and its captured EAR public keys are released.
- Add RemoveEARKeysUseCase to Account.deleteKeychainItems(), ensuring no orphaned EAR keys survive in the keychain after logout.
Changes Made
- SyncAgent: Added tearDown() method that nils the delegate and fires a task to call suspend(), which cancels the ongoing sync and closes the push channel.
- ZMUserSession.tearDown(): Replaced syncAgent?.delegate = nil with syncAgent?.tearDown().
- RemoveEARKeysUseCase (new): Use case that deletes all five EAR keychain entries (primary/secondary public, primary/secondary private, database key) for a given account ID.
- Account.deleteKeychainItems(): Added RemoveEARKeysUseCase().invoke(accountID:) alongside the existing CoreCrypto key cleanup.
- Tests: added RemoveEARKeysUseCaseTests and EARKeyRepositoryTests; extended AccountTests and SyncAgentTests.
Testing
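As a starting point for checking the two changes in isolation, the sketch below models them with hypothetical stand-ins. All names (FakePushChannel, SyncAgentSketch, InMemoryKeychain, EARKeyKind, RemoveEARKeysSketch, earKeychainIdentifier) and the keychain identifier format are assumptions made for illustration; they are not the types or the tests added in this PR.

```swift
import Foundation

// Hypothetical stand-in for the push channel (websocket) owned by the sync agent.
final class FakePushChannel {
    private(set) var isOpen = true
    func close() { isOpen = false }
}

protocol SyncAgentDelegateSketch: AnyObject {}

// Hypothetical sync agent. @unchecked Sendable keeps the sketch short;
// real code would need proper isolation.
final class SyncAgentSketch: @unchecked Sendable {
    weak var delegate: SyncAgentDelegateSketch?
    private let pushChannel: FakePushChannel

    init(pushChannel: FakePushChannel) {
        self.pushChannel = pushChannel
    }

    // Assumed behaviour: suspending closes the push channel.
    func suspend() async {
        pushChannel.close()
    }

    // Nil the delegate, then fire a task that suspends the agent.
    // The task is returned so callers (for example tests) can await completion.
    @discardableResult
    func tearDown() -> Task<Void, Never> {
        delegate = nil
        return Task { await self.suspend() }
    }
}

// The five EAR keychain entries named in the change list.
enum EARKeyKind: String, CaseIterable {
    case primaryPublic, secondaryPublic, primaryPrivate, secondaryPrivate, databaseKey
}

// Assumed identifier format, for illustration only.
func earKeychainIdentifier(accountID: UUID, kind: EARKeyKind) -> String {
    "\(accountID.uuidString).\(kind.rawValue)"
}

// Dictionary-backed stand-in for the system keychain.
final class InMemoryKeychain {
    private var items: [String: Data] = [:]
    var count: Int { items.count }
    func set(_ data: Data, for identifier: String) { items[identifier] = data }
    func delete(_ identifier: String) { items[identifier] = nil }
}

// Deletes every EAR entry for one account and leaves other items untouched.
struct RemoveEARKeysSketch {
    let keychain: InMemoryKeychain

    func invoke(accountID: UUID) {
        for kind in EARKeyKind.allCases {
            keychain.delete(earKeychainIdentifier(accountID: accountID, kind: kind))
        }
    }
}
```

Returning the Task from tearDown() is a design choice of this sketch: it lets a caller await the suspension instead of polling for the channel to close.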
Checklist
[WPB-XXX].