fix(session-replay): Stabilize video assembly#8041
Conversation
Extract the replay video-assembly fixes from the capture-backoff branch so they can land and ship independently: - Retain the last frame captured before a segment window so recovered and resumed segments start with the correct screen state instead of dropping it (this moves the recovered replay start timestamp to the retained frame's time). - Fill and bound fractional gaps between captured frames so video timing stays stable when captures are skipped under load. - Use half-open video windows to avoid duplicating boundary frames in consecutive segments. - Drop video segments that contain no frames instead of emitting empty videos. - Keep video timing stable when a cached frame image cannot be read. - Serialize oldestFrameDate reads on the processing queue.
5b86990 to
cb8565a
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #8041 +/- ##
=============================================
+ Coverage 87.601% 87.633% +0.031%
=============================================
Files 559 559
Lines 32174 32289 +115
Branches 13158 13223 +65
=============================================
+ Hits 28185 28296 +111
- Misses 3941 3944 +3
- Partials 48 49 +1
... and 6 files with indirect coverage changes Continue to review full report in Codecov by Harness.
|
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 6f75b41. Configure here.
Removed merge conflict markers and updated changelog for version 9.17.1.
📲 Install BuildsiOS
|
Performance metrics 🚀
|
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 909f4ae | 1222.85 ms | 1253.26 ms | 30.41 ms |
| 5a9c804 | 1220.29 ms | 1255.17 ms | 34.88 ms |
| f213b0b | 1220.62 ms | 1251.59 ms | 30.97 ms |
| 466b8d9 | 1227.17 ms | 1252.85 ms | 25.68 ms |
| 36c25b3 | 1229.10 ms | 1255.83 ms | 26.73 ms |
| 8bb6aeb | 1215.82 ms | 1244.84 ms | 29.02 ms |
| 58e87fd | 1234.27 ms | 1262.09 ms | 27.82 ms |
| e56c394 | 1229.37 ms | 1245.86 ms | 16.49 ms |
| 4c1337a | 1230.65 ms | 1252.98 ms | 22.33 ms |
| 39655d1 | 1224.05 ms | 1258.38 ms | 34.33 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 909f4ae | 24.14 KiB | 1.16 MiB | 1.13 MiB |
| 5a9c804 | 24.14 KiB | 1.15 MiB | 1.13 MiB |
| f213b0b | 24.14 KiB | 1.17 MiB | 1.14 MiB |
| 466b8d9 | 24.14 KiB | 1.18 MiB | 1.15 MiB |
| 36c25b3 | 24.14 KiB | 1.18 MiB | 1.16 MiB |
| 8bb6aeb | 24.14 KiB | 1.17 MiB | 1.15 MiB |
| 58e87fd | 24.14 KiB | 1.17 MiB | 1.14 MiB |
| e56c394 | 24.14 KiB | 1.16 MiB | 1.13 MiB |
| 4c1337a | 24.14 KiB | 1.17 MiB | 1.15 MiB |
| 39655d1 | 24.14 KiB | 1.16 MiB | 1.14 MiB |
Previous results on branch: fix/replay-video-assembly
Startup times
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| f98aa1c | 1229.22 ms | 1259.48 ms | 30.26 ms |
| 4832942 | 1230.43 ms | 1265.88 ms | 35.45 ms |
| e8e4651 | 1217.60 ms | 1247.39 ms | 29.79 ms |
| f17e4d9 | 1224.52 ms | 1255.33 ms | 30.81 ms |
| 104ce0c | 1215.65 ms | 1245.04 ms | 29.39 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| f98aa1c | 24.14 KiB | 1.18 MiB | 1.16 MiB |
| 4832942 | 24.14 KiB | 1.18 MiB | 1.16 MiB |
| e8e4651 | 24.14 KiB | 1.18 MiB | 1.16 MiB |
| f17e4d9 | 24.14 KiB | 1.18 MiB | 1.16 MiB |
| 104ce0c | 24.14 KiB | 1.18 MiB | 1.16 MiB |
Inline videoFramesForRendering and replaceRetainedFrame at their only call sites and drop the one-line frame(_:movedTo:) wrapper. No behavior change.
Inline appendLastFrameUntilVideoEnd, cancelWriting, presentationFrameCount, presentationFrameIndex, processCurrentFrame, processUnreadableFrame, and the appendLastFrame(date) overload into their single call sites. Keeps appendLastFrame(untilFrameIndex:), append(image:forFrame:), and handleAppendResult which have multiple callers.
Use distinct timestamps so the gap-filling logic is actually exercised, and update assertions to expect the gap-fill frame.
philprime
left a comment
There was a problem hiding this comment.
Good progress, left a couple of comments to discuss
| return onCompletion(.failure(videoWriter.error ?? SentryOnDemandReplayError.errorRenderingVideo)) | ||
| } | ||
| guard frameIndex < videoFrames.count else { | ||
| if let videoEnd = videoEnd, let videoStart = videoStart { |
There was a problem hiding this comment.
m: Please add a comment block here explaining the reasoning for this handling, especially what it means if videoEnd and videoStart are given vs. when they are absent. I don't see any documentation explaining this to future maintainers
|
|
||
| private func handleAppendResult( | ||
| _ result: AppendFrameResult, | ||
| onCompletion: @escaping (Result<SentryRenderVideoResult, Error>) -> Void |
There was a problem hiding this comment.
m: Even though this is a refactoring, I don't see a reason why this method needs an async completion handler. Couldn't this be a throwing method with a tuple of two values instead?
There was a problem hiding this comment.
switched to a throwing method + do-catch in 7212439
| } | ||
| } | ||
|
|
||
| private func appendLastFrame(untilFrameIndex targetFrameIndex: Int, videoWriterInput: AVAssetWriterInput) -> AppendFrameResult { |
There was a problem hiding this comment.
h: I am a bit confused here. The method indicates that it just appends a single frame, but the logic is a loop so it can append multiple ones. Is this method repeating the last frame until the target index is reached?
There was a problem hiding this comment.
renamed to repeatLastFrame + added a doc comment in f28011f
| return retained.time > current.time ? retained : current | ||
| } | ||
|
|
||
| private func removeFrameFile(_ frame: SentryReplayFrame) { |
There was a problem hiding this comment.
m: This seems to be a duplicate of the SentryVideoFrameProcessor extension introduced in this PR, can this be deduplicated?
| } | ||
|
|
||
| deinit { | ||
| let retainedFrame = retainedFrameLock.synchronized { |
There was a problem hiding this comment.
m: did not verify it, but we can not rely on cleanup in deinit for crash cases. What happens if the app restarts after a crash and the retained frame was not cleaned up? Do we need different handling between crash recovery and clean shutdown?
There was a problem hiding this comment.
added a comment explaining it in 48952c8. For the crash case it'd be handled by the crash recovery codepath (we don't need the retainedFrame there), and after sending the last remaining segment, it would be deleted as part of the regular cleanup after recovery.
| public var oldestFrameDate: Date? { | ||
| return _frames.first?.time | ||
| var oldestFrameDate: Date? | ||
| processingQueue.dispatchSync { |
There was a problem hiding this comment.
h: Are certain that this is not called on the main thread? Because a synchronized dispatch from main can cause performance issues
There was a problem hiding this comment.
Reverted this to how it is on main in 20ff61c. Also changed from public to internal as it's only used in one codepath -- upon crash recovery. We don't need to synchronize there because we call this before any of the frames are being processed and never again. Also, no need for setting the retained frame -- If the retained PNG survived the crash, loadFrames already puts it into _frames, sorted by timestamp. Setting a separate retained field would be duplicate state for no gain.
Document why bounded replay segments repeat the last captured frame until the requested video end.
Make append result handling return a boolean or throw, instead of taking the video completion callback.
Name the helper for what it does: repeat the last appended frame until the video reaches a target output index.
Replace duplicate replay frame and empty video deletion helpers with a single file removal helper.
Document why clean shutdown removes the retained frame file while crash recovery leaves it for the next launch.
Read loaded replay frames directly during recovery because recording has not started yet.

Extracts the replay video-assembly fixes from #7851 into their own PR so they can be reviewed and land first; #7851 (the capture-scheduler/backoff rework) builds on top of them and will be retargeted onto this branch. Only
SentryOnDemandReplayandSentryVideoFrameProcessor(plus their tests) change; the public surface of both is untouched, and nothing here references the new capture pacing.These fixes were motivated by the backoff work, which makes irregular capture cadence the norm — but the conditions they handle already occur on
mainwhen captures are skipped under load, and two of them are flat-out pre-existing issues. Behavior changes, all in how segment videos are assembled from captured frames:testSessionReplayForCrashexpectation).oldestFrameDatereads are serialized on the processing queue to avoid racing frame mutations.Review the
SentryVideoFrameProcessordiff first (frame-index based appending is the core change), thenSentryOnDemandReplay(window selection and retained-frame bookkeeping). Verified with the four session-replay test suites (116 tests) against main's display-link capture path.