Device selection criteria for usecase-driven scenarios #902

WebNN has been moving away from explicit device selection (explicitly stating the device kind: GPU/NPU/CPU) in favor of describing the *intent* and expectations of the workload (currently limited to `MLPowerPreference` and whether to request an `MLContextOptions::accelerated` device). However, there may be [more complex scenarios where we want to be *more explicit*](https://github.com/webmachinelearning/webnn/pull/895#issuecomment-3501326255), or cases where we want to be *less explicit* and let WebNN or the backends choose the device based on intent.

For example, independent attributes that could contribute to device selection include: how big the workload is; how immediately you want the result (a GPU can finish a large workload faster than the CPU, but when you want low latency on small workloads, the CPU is often better); whether the work is continuous and repeating or single shot; desired power usage; and so on.

- **acceptable latency**: low, medium, high
- **continuity**: continuous, moderate, single shot
- **workload size**: small, medium, large
- **power usage**: low, medium, high
- ...

e.g.

| App scenarios             | preferred latency | workload size | continuity          | ideal device |
|---------------------------|-------------------|---------------|---------------------|--------------|
| Realtime audio filtering  | low               | small         | continuous          | CPU/NPU (not GPU due to overhead of readback time) |
| Audio to text             | ignorable         | medium        | continuous          | Any |
| Realtime video processing | low               | large         | continuous          | GPU (since likely to be displayed afterward anyway) |
| Offline video processing  | ignorable         | large         | continuous          | GPU/NPU (since large continuous workload) |
| Image generation          | medium            | large         | multiple executions | GPU/NPU (much faster than CPU) |
| Image recognition         | ignorable         | small         | single execution    | Any (CPU is fast enough for simple recognition) |
| Code editor prediction    | low               | small         | single execution    | CPU (since small workload and want to avoid pegging the GPU while typing) |

(conceptually it's kind of like the font selection problem - family & weight & width & slope ...)
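To make the table concrete, here is a rough sketch of how such hints might be combined into a device preference. Everything in it is hypothetical: `WorkloadHints`, `suggestDevices`, and the value names are illustration only and are not part of WebNN or any proposal. The three rules are just the smallest set that happens to reproduce the rows above, and they cannot capture reasons that live outside the hints (for instance, that realtime video prefers the GPU because the result is likely to be displayed afterward).

```typescript
// Hypothetical hint vocabulary based on the attributes listed above.
// None of these names exist in the WebNN API.
type Latency = "low" | "medium" | "ignorable";
type WorkloadSize = "small" | "medium" | "large";
type Continuity = "continuous" | "multiple" | "single";
type Device = "cpu" | "gpu" | "npu";

interface WorkloadHints {
  latency: Latency;
  size: WorkloadSize;
  continuity: Continuity;
}

// Illustrative heuristic only: returns the acceptable devices for a workload.
// An actual implementation would also weigh power usage, device
// availability, and where the output is consumed.
function suggestDevices(hints: WorkloadHints): Device[] {
  if (hints.size === "large") {
    // Large workloads favor accelerators; with a tight latency budget the
    // table above narrows this to the GPU.
    return hints.latency === "low" ? ["gpu"] : ["gpu", "npu"];
  }
  if (hints.size === "small" && hints.latency === "low") {
    // Small, latency-sensitive work avoids GPU dispatch/readback overhead.
    // One-off requests stay on the CPU; repeating work may also suit the NPU.
    return hints.continuity === "single" ? ["cpu"] : ["cpu", "npu"];
  }
  // No strong constraint: any device is acceptable.
  return ["cpu", "gpu", "npu"];
}

// Two rows from the table, expressed as hints.
const audioFilter = suggestDevices({ latency: "low", size: "small", continuity: "continuous" });
const imageGeneration = suggestDevices({ latency: "medium", size: "large", continuity: "multiple" });
console.log(audioFilter.join("/"), imageGeneration.join("/"));
```

Returning a list rather than a single device keeps the final choice with the implementation, which fits the "less explicit" direction: the page states its intent and the backend picks among the acceptable devices.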

Seeding this discussion @anssiko for user feedback... 📋
