Skip to content

fix(session-replay): Stabilize video assembly#8041

Open
romtsn wants to merge 16 commits into
mainfrom
fix/replay-video-assembly
Open

fix(session-replay): Stabilize video assembly#8041
romtsn wants to merge 16 commits into
mainfrom
fix/replay-video-assembly

Conversation

@romtsn

@romtsn romtsn commented Jun 11, 2026

Copy link
Copy Markdown
Member

Extracts the replay video-assembly fixes from #7851 into their own PR so they can be reviewed and land first; #7851 (the capture-scheduler/backoff rework) builds on top of them and will be retargeted onto this branch. Only SentryOnDemandReplay and SentryVideoFrameProcessor (plus their tests) change; the public surface of both is untouched, and nothing here references the new capture pacing.

These fixes were motivated by the backoff work, which makes irregular capture cadence the norm — but the conditions they handle already occur on main when captures are skipped under load, and two of them are flat-out pre-existing issues. Behavior changes, all in how segment videos are assembled from captured frames:

  • The last frame captured before a segment window is retained, so recovered and resumed segments start with the correct screen state instead of dropping it. A recovered crash replay's start timestamp now reflects that retained frame (see the updated testSessionReplayForCrash expectation).
  • Fractional gaps between captured frames are filled (and bounded) so video timing stays stable when captures are skipped.
  • Video windows are half-open, so consecutive segments no longer duplicate the boundary frame.
  • Segments that end up with no frames are dropped instead of being emitted as empty videos.
  • Video timing stays stable when a cached frame image cannot be read back from disk.
  • oldestFrameDate reads are serialized on the processing queue to avoid racing frame mutations.

Review the SentryVideoFrameProcessor diff first (frame-index based appending is the core change), then SentryOnDemandReplay (window selection and retained-frame bookkeeping). Verified with the four session-replay test suites (116 tests) against main's display-link capture path.

Extract the replay video-assembly fixes from the capture-backoff
branch so they can land and ship independently:

- Retain the last frame captured before a segment window so recovered
  and resumed segments start with the correct screen state instead of
  dropping it (this moves the recovered replay start timestamp to the
  retained frame's time).
- Fill and bound fractional gaps between captured frames so video
  timing stays stable when captures are skipped under load.
- Use half-open video windows to avoid duplicating boundary frames in
  consecutive segments.
- Drop video segments that contain no frames instead of emitting empty
  videos.
- Keep video timing stable when a cached frame image cannot be read.
- Serialize oldestFrameDate reads on the processing queue.
@romtsn romtsn force-pushed the fix/replay-video-assembly branch from 5b86990 to cb8565a Compare June 11, 2026 12:10
@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 96.73203% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.633%. Comparing base (70e2dea) to head (4ddd96e).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...ions/SessionReplay/SentryVideoFrameProcessor.swift 96.774% 3 Missing ⚠️
...egrations/SessionReplay/SentryOnDemandReplay.swift 96.610% 1 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@              Coverage Diff              @@
##              main     #8041       +/-   ##
=============================================
+ Coverage   87.601%   87.633%   +0.031%     
=============================================
  Files          559       559               
  Lines        32174     32289      +115     
  Branches     13158     13223       +65     
=============================================
+ Hits         28185     28296      +111     
- Misses        3941      3944        +3     
- Partials        48        49        +1     
Files with missing lines Coverage Δ
...grations/SessionReplay/SessionReplayRecovery.swift 91.150% <100.000%> (ø)
...egrations/SessionReplay/SentryOnDemandReplay.swift 93.379% <96.610%> (+1.134%) ⬆️
...ions/SessionReplay/SentryVideoFrameProcessor.swift 95.789% <96.774%> (+0.099%) ⬆️

... and 6 files with indirect coverage changes


Continue to review full report in Codecov by Harness.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 70e2dea...4ddd96e. Read the comment docs.

@romtsn romtsn marked this pull request as ready for review June 11, 2026 17:15

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 6f75b41. Configure here.

Comment thread CHANGELOG.md Outdated
Removed merge conflict markers and updated changelog for version 9.17.1.
@romtsn romtsn added the ready-to-merge Use this label to trigger all PR workflows label Jun 11, 2026
@sentry

sentry Bot commented Jun 11, 2026

Copy link
Copy Markdown

📲 Install Builds

iOS

🔗 App Name App ID Version Configuration
SDK-Size io.sentry.sample.SDK-Size 9.17.1 (1) Release

⚙️ sentry-cocoa Build Distribution Settings

@github-actions

github-actions Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Performance metrics 🚀

  Plain With Sentry Diff
Startup time 1234.65 ms 1258.87 ms 24.22 ms
Size 24.14 KiB 1.18 MiB 1.16 MiB

Baseline results on branch: main

Startup times

Revision Plain With Sentry Diff
909f4ae 1222.85 ms 1253.26 ms 30.41 ms
5a9c804 1220.29 ms 1255.17 ms 34.88 ms
f213b0b 1220.62 ms 1251.59 ms 30.97 ms
466b8d9 1227.17 ms 1252.85 ms 25.68 ms
36c25b3 1229.10 ms 1255.83 ms 26.73 ms
8bb6aeb 1215.82 ms 1244.84 ms 29.02 ms
58e87fd 1234.27 ms 1262.09 ms 27.82 ms
e56c394 1229.37 ms 1245.86 ms 16.49 ms
4c1337a 1230.65 ms 1252.98 ms 22.33 ms
39655d1 1224.05 ms 1258.38 ms 34.33 ms

App size

Revision Plain With Sentry Diff
909f4ae 24.14 KiB 1.16 MiB 1.13 MiB
5a9c804 24.14 KiB 1.15 MiB 1.13 MiB
f213b0b 24.14 KiB 1.17 MiB 1.14 MiB
466b8d9 24.14 KiB 1.18 MiB 1.15 MiB
36c25b3 24.14 KiB 1.18 MiB 1.16 MiB
8bb6aeb 24.14 KiB 1.17 MiB 1.15 MiB
58e87fd 24.14 KiB 1.17 MiB 1.14 MiB
e56c394 24.14 KiB 1.16 MiB 1.13 MiB
4c1337a 24.14 KiB 1.17 MiB 1.15 MiB
39655d1 24.14 KiB 1.16 MiB 1.14 MiB

Previous results on branch: fix/replay-video-assembly

Startup times

Revision Plain With Sentry Diff
f98aa1c 1229.22 ms 1259.48 ms 30.26 ms
4832942 1230.43 ms 1265.88 ms 35.45 ms
e8e4651 1217.60 ms 1247.39 ms 29.79 ms
f17e4d9 1224.52 ms 1255.33 ms 30.81 ms
104ce0c 1215.65 ms 1245.04 ms 29.39 ms

App size

Revision Plain With Sentry Diff
f98aa1c 24.14 KiB 1.18 MiB 1.16 MiB
4832942 24.14 KiB 1.18 MiB 1.16 MiB
e8e4651 24.14 KiB 1.18 MiB 1.16 MiB
f17e4d9 24.14 KiB 1.18 MiB 1.16 MiB
104ce0c 24.14 KiB 1.18 MiB 1.16 MiB

Inline videoFramesForRendering and replaceRetainedFrame at their only
call sites and drop the one-line frame(_:movedTo:) wrapper. No behavior
change.
Inline appendLastFrameUntilVideoEnd, cancelWriting,
presentationFrameCount, presentationFrameIndex, processCurrentFrame,
processUnreadableFrame, and the appendLastFrame(date) overload into
their single call sites. Keeps appendLastFrame(untilFrameIndex:),
append(image:forFrame:), and handleAppendResult which have multiple
callers.
romtsn added 2 commits June 12, 2026 12:15
Use distinct timestamps so the gap-filling logic is actually
exercised, and update assertions to expect the gap-fill frame.

@philprime philprime left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good progress, left a couple of comments to discuss

return onCompletion(.failure(videoWriter.error ?? SentryOnDemandReplayError.errorRenderingVideo))
}
guard frameIndex < videoFrames.count else {
if let videoEnd = videoEnd, let videoStart = videoStart {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m: Please add a comment block here explaining the reasoning for this handling, especially what it means if videoEnd and videoStart are given vs. when they are absent. I don't see any documentation explaining this to future maintainers

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added in e10075d


private func handleAppendResult(
_ result: AppendFrameResult,
onCompletion: @escaping (Result<SentryRenderVideoResult, Error>) -> Void

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m: Even though this is a refactoring, I don't see a reason why this method needs an async completion handler. Couldn't this be a throwing method with a tuple of two values instead?

@romtsn romtsn Jun 15, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switched to a throwing method + do-catch in 7212439

}
}

private func appendLastFrame(untilFrameIndex targetFrameIndex: Int, videoWriterInput: AVAssetWriterInput) -> AppendFrameResult {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

h: I am a bit confused here. The method indicates that it just appends a single frame, but the logic is a loop so it can append multiple ones. Is this method repeating the last frame until the target index is reached?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed to repeatLastFrame + added a doc comment in f28011f

return retained.time > current.time ? retained : current
}

private func removeFrameFile(_ frame: SentryReplayFrame) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m: This seems to be a duplicate of the SentryVideoFrameProcessor extension introduced in this PR, can this be deduplicated?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

created a shared helper in b36cc55

}

deinit {
let retainedFrame = retainedFrameLock.synchronized {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m: did not verify it, but we can not rely on cleanup in deinit for crash cases. What happens if the app restarts after a crash and the retained frame was not cleaned up? Do we need different handling between crash recovery and clean shutdown?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a comment explaining it in 48952c8. For the crash case it'd be handled by the crash recovery codepath (we don't need the retainedFrame there), and after sending the last remaining segment, it would be deleted as part of the regular cleanup after recovery.

public var oldestFrameDate: Date? {
return _frames.first?.time
var oldestFrameDate: Date?
processingQueue.dispatchSync {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

h: Are certain that this is not called on the main thread? Because a synchronized dispatch from main can cause performance issues

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted this to how it is on main in 20ff61c. Also changed from public to internal as it's only used in one codepath -- upon crash recovery. We don't need to synchronize there because we call this before any of the frames are being processed and never again. Also, no need for setting the retained frame -- If the retained PNG survived the crash, loadFrames already puts it into _frames, sorted by timestamp. Setting a separate retained field would be duplicate state for no gain.

romtsn added 6 commits June 15, 2026 13:03
Document why bounded replay segments repeat the last captured frame until the requested video end.
Make append result handling return a boolean or throw, instead of taking the video completion callback.
Name the helper for what it does: repeat the last appended frame until the video reaches a target output index.
Replace duplicate replay frame and empty video deletion helpers with a single file removal helper.
Document why clean shutdown removes the retained frame file while crash recovery leaves it for the next launch.
Read loaded replay frames directly during recovery because recording has not started yet.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-to-merge Use this label to trigger all PR workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants