Skip to content

fix: prevent session replay screenshot PII leak on PixelCopy timeout#471

Draft
marandaneto wants to merge 4 commits intomainfrom
fix/replay-screenshot-pii-leak-on-timeout
Draft

fix: prevent session replay screenshot PII leak on PixelCopy timeout#471
marandaneto wants to merge 4 commits intomainfrom
fix/replay-screenshot-pii-leak-on-timeout

Conversation

@marandaneto
Copy link
Copy Markdown
Member

@marandaneto marandaneto commented Mar 30, 2026

💡 Motivation and Context

Fixes #450 — Session replay screenshots intermittently sent with literally nothing masked, leaking PII.

Root cause analysis

findMaskableWidgets ran on the PixelCopy callback thread (background), causing three classes of failures:

  1. Thread safety: View properties (text, hint, visibility, alpha, childCount, getGlobalVisibleRect, etc.) were read from a background thread. On Android, the UI toolkit is single-threaded — reading from another thread is undefined behavior that can return stale or inconsistent values.

  2. Silent mask rect drops: isViewStateStableForMatrixOperations() returns false during animations, layout passes, or transient state (common in React Native bridge updates). This caused globalVisibleRect() to return null, and the mask rect was silently not added — no log, no error. The view IS visible on screen (PixelCopy captured it) but no mask is applied. When ALL maskable views happen to be in transient state, the entire screenshot goes out unmasked.

  3. Subtree skipping: isVisible() relies on hasGlobalVisibleRect() which also fails during transient state. Entire view subtrees were skipped in the traversal even though PixelCopy captured them.

  4. Latch timeout PII leak: If the PixelCopy callback took >1s (e.g. due to Compose semantics traversal), latch.await() timed out but success was still true, encoding an unmasked bitmap.

Approach

Instead of traversing the view tree on a background thread AFTER PixelCopy, we now:

  1. Collect mask rects on the main thread BEFORE PixelCopy (thread-safe, stable view state)
  2. Reset isOnDrawnCalled right before PixelCopy.request() to keep the detection window tight (~1 vsync)
  3. PixelCopy captures the screen
  4. Callback checks isOnDrawnCalled — if the screen changed during capture, discard
  5. Apply pre-computed mask rects to the bitmap

💚 How did you test it?

  • Code review and static analysis of all concurrency paths
  • Verified compilation (./gradlew :posthog-android:compileReleaseKotlin)
  • Ran JVM test suite (make testJava) — all pass

📝 Checklist

  • I reviewed the submitted code.
  • I added tests to verify the changes.
  • I updated the docs if needed.
  • No breaking change or entry added to the changelog.

If releasing new changes

  • Ran pnpm changeset to generate a changeset file
  • Added the "release" label to the PR to indicate we're publishing new versions for the affected packages

- Check latch.await() return value so unmasked bitmaps are never
  encoded when the callback times out
- Replace forEach+return@forEach with for+break so the masking loop
  exits immediately on screen changes
@marandaneto marandaneto requested a review from a team as a code owner March 30, 2026 14:31
}

maskableWidgets.forEach {
for (rect in maskableWidgets) {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a bug but wasting cycles

try {
// await for 1s max
latch.await(1000, TimeUnit.MILLISECONDS)
val completed = latch.await(1000, TimeUnit.MILLISECONDS)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the PixelCopy callback can take longer than the 1-second latch timeout

very unlikely but doing the change anyway

@marandaneto marandaneto enabled auto-merge (squash) March 30, 2026 14:34
@marandaneto
Copy link
Copy Markdown
Member Author

wondering if it helps with #450

Root cause of #450: findMaskableWidgets ran on the PixelCopy callback
thread, causing three classes of PII leaks:

1. Thread safety: View properties (text, visibility, layout) were read
   from a background thread — undefined behavior on Android that can
   return stale/inconsistent values.

2. Silent rect drops: isViewStateStableForMatrixOperations() returns
   false during animations/layout passes, causing globalVisibleRect()
   to return null. The mask rect was silently not added — the view IS
   visible on screen (PixelCopy captured it) but no mask is applied.

3. Subtree skipping: isVisible() relies on hasGlobalVisibleRect() which
   also fails during transient state. Entire view subtrees were skipped
   even though PixelCopy captured them.

Changes:
- Move findMaskableWidgets to main thread (via mainHandler.post) before
  PixelCopy capture. This ensures thread-safe property access and views
  are in a stable state during traversal.
- Add globalVisibleRectForMasking() that falls back to getLocationOnScreen
  + width/height when the strict stability check fails.
- Add isLikelyVisibleForMasking() that checks alpha/visibility without
  requiring hasGlobalVisibleRect(), preventing subtree skipping.
- Add getTextAreaGlobalVisibleRectForMasking() with same fallback.
- Fix findMaskableComposeWidgets to detect when already on main thread
  and run inline, preventing deadlock.
- Reset isOnDrawnCalled on main thread after rect collection to detect
  screen changes between rect collection and pixel capture.
@marandaneto marandaneto marked this pull request as draft March 30, 2026 14:49
auto-merge was automatically disabled March 30, 2026 14:49

Pull request was converted to draft

Resetting the flag on the main thread after rect collection created a
wide gap (thread switch + bitmap allocation) during which normal draws
(cursor blinks, animations) would always set it to true, causing nearly
every screenshot to be discarded.

Move the reset to right before PixelCopy.request() on the executor
thread, restoring the original ~1 vsync detection window.
Now that findMaskableWidgets runs on the main thread, these checks are
harmful: isOnDrawnCalled is still true from the draw that triggered the
snapshot, causing the method to always return false and bail early.

On the main thread, onDraw cannot fire during traversal (we're blocking
the looper), so the checks are unnecessary. Screen-change detection is
handled by the isOnDrawnCalled check in the PixelCopy callback.

Also changed return type to Unit since the method no longer signals
early abort.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Android session replay ignores masking on TextInput (ph-mask, maskAllTextInputs, PostHogMaskView all ineffective)

1 participant