Flutter iOS E2E: random per-session CoreSimulator simctl-launch wedge (quarantined in CI) #421

@goosewobbler

Summary

E2E - Flutter [iOS] is flaky on GitHub-Actions macOS runners due to a random, per-session CoreSimulator app-launch wedge, not a cold-start/warmth problem. It is quarantined (removed from the CI Status gate's needs in .github/workflows/ci.yml) as of #398/#387 — the job still runs and is visible on PRs but does not block. RN iOS stays gated, so iOS-mobile coverage isn't lost.

Symptom

On some runs, 1–3 specs fail at session-create, ~7 min each, with either:

  • WebDriverError: The operation was aborted due to timeout on POST /session (the wdio client aborts at connectionRetryTimeout), and in the appium log:
    Command 'xcrun simctl launch --terminate-running-process <udid> com.facebook.WebDriverAgentRunner.xctrunner' timed out after 600000ms (also seen on simctl terminate), or
  • XCTDaemonErrorDomain Code=5 "Timed out attempting to launch app."

appium-xcuitest launches the preinstalled WDA via simctl launch once per session (appium-webdriveragent launchWithPreinstalledWDA). That simctl exec intermittently hangs for the full 600 s node-simctl timeout; appium:wdaLaunchTimeout does not bound it, and no capability shortens it.
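For triage, the two signatures above can be matched mechanically against the appium log. The following is a minimal sketch, not code from the repo: `classifyWedge`, `WedgeKind`, and the pattern table are hypothetical names, and the regexes are derived only from the two log lines quoted in this issue, so they may need loosening for other Appium versions.

```typescript
// Hypothetical triage helper; the patterns come from the log lines quoted above.
type WedgeKind = 'simctl-timeout' | 'xctdaemon-launch-timeout';

const WEDGE_PATTERNS: ReadonlyArray<[WedgeKind, RegExp]> = [
  // e.g. "Command 'xcrun simctl launch ...' timed out after 600000ms"
  // (the same timeout was also seen on `simctl terminate`)
  ['simctl-timeout', /xcrun simctl (launch|terminate)\b.*timed out after \d+ms/],
  // e.g. XCTDaemonErrorDomain Code=5 "Timed out attempting to launch app."
  ['xctdaemon-launch-timeout', /XCTDaemonErrorDomain Code=5/],
];

// Returns the wedge signature found in a single log line, or null if none matches.
function classifyWedge(logLine: string): WedgeKind | null {
  for (const [kind, pattern] of WEDGE_PATTERNS) {
    if (pattern.test(logLine)) return kind;
  }
  return null;
}
```

Running this over the appium log of a failed job separates wedge failures from genuine test failures, which helps when counting how often the wedge hits.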

What was ruled out (PR #398/#387)

  • usePreinstalledWDA (vs usePrebuiltWDA/xcodebuild): fixed RN iOS and removed the in-session compile, but Flutter iOS still per-session-launches the WDA xctrunner.
  • Stripping embedded Frameworks/XC*: a real-device-only step; on the simulator it makes the runner crash on launch (SBMainWorkspace denial). Removed.
  • skipLogCapture + appium:dartVmServicePort pin (to drop the concurrent simctl log stream): non-viable — the iOS Flutter engine ignores the VM-service port pin (documented in e2e/wdio.flutter.conf.ts), so syslog is the only iOS VM-discovery path and can't be disabled.
  • SpringBoard / WDA-launch warm-up, incl. a blocking gate that required a fast simctl launch before grading: the gate confirmed "warm" (round 1) yet grading still wedged 2 specs — warmth does not predict graded-session success, confirming the wedge is random per-session.
  • specFileRetries (with and without specFileRetriesDeferred): a session-create timeout kills the worker in a way WDIO does not re-run (verified: the wedged spec was reported as failed with no extra worker run).

Proposed fix (the reason this is tracked, not abandoned)

External WDA: launch WebDriverAgent once in the workflow (replicating appium's exact env: USE_PORT + WDA_PRODUCT_BUNDLE_IDENTIFIER), retry until its /status answers, keep it resident, and attach every session via appium:webDriverAgentUrl. This eliminates the per-session simctl launch entirely, so the random wedge can't hit session-create. A first attempt failed because the standalone WDA never started serving (the manual simctl launch didn't bring up the :8100 server). The next attempt needs the env replicated correctly plus a retry-until-serving loop.
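The retry-until-serving loop and the attach step could look roughly like the sketch below. It is an illustration under assumptions, not the planned implementation: `waitForWda`, `httpProbe`, `StatusProbe`, the attempt count, the delays, and the `127.0.0.1` host are all hypothetical. Only the `/status` endpoint, port 8100, and the `appium:webDriverAgentUrl` capability come from this issue. The built-in `fetch` assumes Node 18 or newer.

```typescript
// The probe is injectable so the loop can be exercised without a simulator.
type StatusProbe = (baseUrl: string) => Promise<boolean>;

// Default probe: true when GET <baseUrl>/status answers with a 2xx response.
const httpProbe: StatusProbe = async (baseUrl) => {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), 5_000); // per-request cap (assumed value)
  try {
    const res = await fetch(`${baseUrl}/status`, { signal: controller.signal });
    return res.ok;
  } catch {
    return false; // connection refused or aborted: WDA is not serving yet
  } finally {
    clearTimeout(timer);
  }
};

// Polls until the probe succeeds; resolves with the attempt number that succeeded.
async function waitForWda(
  baseUrl: string,
  options: { attempts?: number; delayMs?: number; probe?: StatusProbe } = {},
): Promise<number> {
  const { attempts = 30, delayMs = 2_000, probe = httpProbe } = options;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    if (await probe(baseUrl)) return attempt;
    if (attempt < attempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`WDA at ${baseUrl} did not answer /status after ${attempts} attempts`);
}

// Host is an assumption; :8100 is the port mentioned in this issue.
const WDA_URL = 'http://127.0.0.1:8100';

// Capability fragment to merge into the Flutter iOS session capabilities so
// each session attaches to the resident WDA instead of launching its own.
const attachCapabilities = {
  'appium:webDriverAgentUrl': WDA_URL,
};
```

A workflow step would call `waitForWda(WDA_URL)` after launching WDA and fail the job early if it throws, so a non-serving WDA surfaces as one fast failure rather than as several ~7 min session-create timeouts.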

Once external-WDA is reliably green across several cold-sim runs, re-add e2e-flutter-ios-macos to ci-status.needs to un-quarantine.

Scope note

The Flutter iOS package smoke runs inside the same job (after the e2e suite), so it is also non-gating while quarantined. RN iOS (both arches) + its package smoke, Flutter Android, and RN Android remain gated.

Labels

  • area:ci (Updates to Continuous Integration)
  • area:tests (Updates to tests)
  • scope:flutter (Flutter service and wdio_flutter Dart contract)
  • type:bug (Something isn't working)
