PWA: offline reload fails with '98% + unhandled error' on production #899

Description

@codemonkey85

Summary

Reloading the app while offline fails with "An unhandled error has occurred" after the splash progress reaches ~98%. The service worker pre-caches assets, but offline navigation does not serve them back correctly. This reproduces on the currently deployed pkmds.app (no AOT, current main), so it is a pre-existing PWA bug, not a regression from #893 (perf/aot-simd).

Reproduction

  1. Open https://pkmds.app in a fresh browser session (or after clearing site data).
  2. Wait for the app to fully load and become interactive.
  3. Go offline (airplane mode on mobile, or DevTools → Network → Offline on desktop).
  4. Hard refresh the page.

Expected: the cached app loads from the service worker and renders the welcome screen.

Actual: the splash bar reaches ~98%, then the app shows "An unhandled error has occurred. Reload". On Android Chrome the navigation does not reach the SW at all, and Chrome shows its standard ERR_INTERNET_DISCONNECTED page.

Evidence

Confirmed on three platforms against current production (pkmds.app, no AOT, main):

  • Desktop Chrome (DevTools throttling = Offline, Disable cache checked): splash → 98% → "An unhandled error has occurred. Reload". 0 of 3649 requests succeeded after going offline. The prior online load showed 34.6 MB transferred / 112.8 MB resources, so caching did happen; the cached assets were just not served back.
  • iOS Safari (airplane mode + reload): same 98% + unhandled-error pattern. Reproduced both after a normal update flow and after a full Safari "Clear Website Data" → online load → offline reload.
  • Android Chrome (airplane mode + reload): Chrome's offline dinosaur page with ERR_INTERNET_DISCONNECTED, meaning the SW did not intercept the navigation at all.
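
To separate "no worker controls the page" (the Android symptom) from "a worker controls the page but misses some requests" (the desktop and iOS symptom), one option is to snapshot the registration state while still online, just before going offline. The helper below is a hypothetical sketch, not code from this repo; it only reads the standard `installing`, `waiting`, and `active` slots of a registration plus `navigator.serviceWorker.controller`.

```javascript
// Hypothetical diagnostic helper (not part of the app).
// Summarizes a ServiceWorkerRegistration-like object and whether the
// current page is controlled by a service worker.
function summarizeRegistration(registration, controller) {
  return {
    installing: registration?.installing?.state ?? null,
    waiting: registration?.waiting?.state ?? null,
    active: registration?.active?.state ?? null,
    controlled: Boolean(controller),
  };
}

// Browser usage, run in the page console while still online:
//   const reg = await navigator.serviceWorker.getRegistration();
//   console.log(summarizeRegistration(reg, navigator.serviceWorker.controller));
```

If `controlled` is `false` while `active` is `"activated"`, the page was loaded without going through the worker. That can happen on the first load after install, and also after a force reload (Shift+reload), which bypasses the service worker.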

Hypotheses to investigate

  • SW lifecycle timing: install/activation may not complete before the user reloads (pre-cache is 30+ MB). Worth instrumenting installing / installed / activating / activated transitions.
  • The framework's bootstrap fetch chain may issue a request the SW does not have in its keyed cache, possibly due to query-string mismatch between service-worker-assets.js integrity URLs and runtime fetches.
  • Android Chrome's failure mode (dino page, no SW intercept) differs from iOS Safari's (SW intercepts most requests, fails late at 98%). They may be two separate bugs, or one root cause manifesting differently per platform.
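
The query-string hypothesis above can be checked offline from two lists: the URLs stored in the cache and the URLs the app requested during the failing reload. The sketch below is a hypothetical helper (the name `classifyRequests` and the `base` default are assumptions, not repo code). It labels each request as an exact hit, a match only when the query string is ignored, or a miss.

```javascript
// Hypothetical diagnostic helper; not part of the app's service worker.
// Classifies each requested URL against the set of cached URLs.
function classifyRequests(cachedUrls, requestedUrls, base = "https://pkmds.app/") {
  const exact = new Set();
  const withoutQuery = new Set();
  for (const raw of cachedUrls) {
    const url = new URL(raw, base);
    url.hash = ""; // fragments are never part of a cache key
    exact.add(url.href);
    url.search = "";
    withoutQuery.add(url.href);
  }
  return requestedUrls.map((raw) => {
    const url = new URL(raw, base);
    url.hash = "";
    if (exact.has(url.href)) return { url: raw, status: "hit" };
    url.search = "";
    if (withoutQuery.has(url.href)) return { url: raw, status: "query-mismatch" };
    return { url: raw, status: "miss" };
  });
}

// Example with made-up file names:
const report = classifyRequests(
  ["_framework/a.wasm", "app.css?v=1"],
  ["_framework/a.wasm", "app.css?v=2", "missing.js"]
);
console.log(report.map((r) => `${r.status}: ${r.url}`).join("\n"));
```

The cached list can be gathered in the console with `(await (await caches.open(name)).keys()).map((r) => r.url)`, where `name` is one of the entries from `await caches.keys()`. Any `query-mismatch` entries would support this hypothesis; a large number of plain `miss` entries would point elsewhere.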

Out of scope

This is not caused by #883 / #893. The AOT+SIMD branch makes the pre-cache larger (~42 MB), but the underlying bug is independent of it: it reproduces on production today.

Suggested next steps

  1. Add structured logging to service-worker.published.js for lifecycle events and for each resolution path in the fetch handler.
  2. Capture a SW + Network trace from a real failing reload (desktop Chrome with DevTools open across the offline reload).
  3. Compare against an older deploy if available: was offline reload ever working on this repo, or has it always been flaky?
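
As a starting point for step 1, the logging could look roughly like the sketch below. All names (`swLog`, `record`, `resolveWithLogging`) are hypothetical, and the handler is deliberately simplified: it does not reproduce whatever routing the existing fetch handler does (for example, serving a cached index page for navigations), so the real change would wrap the existing logic rather than replace it.

```javascript
// Sketch only: swLog, record and resolveWithLogging are hypothetical names.
const swLog = [];

// Append one structured entry and mirror it to the console as JSON.
function record(event, detail = {}) {
  const entry = { t: Date.now(), event, ...detail };
  swLog.push(entry);
  console.log("[sw]", JSON.stringify(entry));
  return entry;
}

// Resolve one request and record which path served it.
// matchCache and fetchNetwork are injected so the logic can run outside a worker.
async function resolveWithLogging(request, matchCache, fetchNetwork) {
  const info = { url: request.url, mode: request.mode };
  const cached = await matchCache(request);
  if (cached) {
    record("fetch:cache-hit", info);
    return cached;
  }
  try {
    const response = await fetchNetwork(request);
    record("fetch:network", { ...info, status: response.status });
    return response;
  } catch (err) {
    record("fetch:failed", { ...info, error: String(err) });
    throw err;
  }
}

// Wiring inside a service worker; skipped when the file runs elsewhere (e.g. Node).
if (typeof ServiceWorkerGlobalScope !== "undefined" && self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener("install", () => record("install"));
  self.addEventListener("activate", () => record("activate"));
  self.addEventListener("fetch", (event) => {
    event.respondWith(
      resolveWithLogging(event.request, (req) => caches.match(req), (req) => fetch(req))
    );
  });
}
```

With this in place, a failing offline reload should show either a run of `fetch:failed` entries (the worker saw the request but had no cached copy) or no `fetch:*` entries at all (the worker never intercepted the navigation), which maps onto the two failure modes described above.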

Related: #883 (AOT/SIMD perf work), which surfaced this while testing offline behavior.

Labels: bug (Something isn't working)