No mechanism to navigate large/infinite spaces #798

@AdrianSosic

Description

Problem

There is no abstraction for a recommender to request a tractable subset of a large candidate space, or to construct candidates from an infinite one.

Currently, every recommender receives the full materialized exp_rep and comp_rep and must work with all of it. The API provides no way to express:

  • "Give me 500 diverse candidates from this 10^9-row space"
  • "Generate 1000 random valid candidates from this infinite space"
  • "Subsample to 10,000 candidates, then pick 50 via farthest-point sampling"

The recommender has no concept of a selection strategy or a construction strategy. It simply gets the entire space and must deal with it.
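To make the third request above concrete, here is a minimal sketch of "subsample, then farthest-point sampling" on a plain NumPy array. It is not part of the existing API: the function names `farthest_point_sampling` and `subsample_then_fps` are hypothetical, and the array merely stands in for a numeric candidate representation such as comp_rep.

```python
import numpy as np


def farthest_point_sampling(points: np.ndarray, k: int, start: int = 0) -> np.ndarray:
    """Greedily pick k row indices; each new pick is the point farthest from all picks so far."""
    n = points.shape[0]
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of points")
    chosen = [start]
    # Distance from every point to its nearest already-chosen point.
    min_dist = np.linalg.norm(points - points[start], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(chosen)


def subsample_then_fps(
    candidates: np.ndarray, n_subsample: int, n_select: int, seed: int = 0
) -> np.ndarray:
    """Return indices into `candidates`: uniform random subsample first, then FPS inside it."""
    rng = np.random.default_rng(seed)
    n_subsample = min(n_subsample, candidates.shape[0])
    pool = rng.choice(candidates.shape[0], size=n_subsample, replace=False)
    local = farthest_point_sampling(candidates[pool], n_select)
    return pool[local]


# Hypothetical stand-in for a numeric candidate table (100,000 rows, 4 columns).
space = np.random.default_rng(42).random((100_000, 4))
picked = subsample_then_fps(space, n_subsample=10_000, n_select=50)
```

The random subsample keeps the cost of the greedy step at roughly n_subsample × n_select distance evaluations instead of scaling with the full space. Note that this sketch still assumes the candidate array exists in memory, so it addresses selection only, not lazy production.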

Why it matters

  • Even if the space were lazily represented (see sub-issue #796, "Eager search space materialization"), without a navigation mechanism the recommender would still need to materialize everything to work with it. Lazy production without tractable consumption still leads to full materialization at the point of use.
  • Large-space optimization is impractical. A BoTorch recommender computing acquisition values over 10^6 candidates is computationally infeasible. There is no built-in way to pre-screen, subsample, or strategically select a working set.
  • Infinite-space optimization is impossible. Non-enumerable spaces (string-valued parameters, hierarchical parameters, combinatorial libraries) cannot produce candidates at all — there is no mechanism to generate a finite set from an infinite domain.
  • Different recommenders need different strategies. A RandomRecommender doesn't need FPS pre-screening; a BotorchRecommender on a million-point space might want aggressive subsampling. The current design cannot express this per-recommender variation.
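One way to picture the missing abstraction is a small strategy interface that each recommender is configured with. The sketch below is purely illustrative: `CandidateStrategy`, `RejectionConstructor`, and `ToyRecommender` are hypothetical names, not existing classes, and the string-valued space with an `is_valid` predicate is an assumed stand-in for a non-enumerable parameter domain.

```python
import random
import string
from typing import Callable, Protocol


class CandidateStrategy(Protocol):
    """Hypothetical interface: produce a finite working set of n candidates."""

    def working_set(self, n: int) -> list[str]: ...


class RejectionConstructor:
    """Hypothetical construction strategy: draw random strings and keep the valid ones."""

    def __init__(
        self,
        is_valid: Callable[[str], bool],
        max_len: int = 8,
        seed: int = 0,
        max_tries_per_candidate: int = 1000,
    ) -> None:
        self._is_valid = is_valid
        self._max_len = max_len
        self._rng = random.Random(seed)
        self._max_tries = max_tries_per_candidate

    def working_set(self, n: int) -> list[str]:
        out: list[str] = []
        budget = n * self._max_tries  # guard against predicates that are never satisfied
        tries = 0
        while len(out) < n:
            if tries >= budget:
                raise RuntimeError("could not construct enough valid candidates")
            tries += 1
            length = self._rng.randint(1, self._max_len)
            s = "".join(self._rng.choices(string.ascii_lowercase, k=length))
            if self._is_valid(s):
                out.append(s)
        return out


class ToyRecommender:
    """Hypothetical recommender that only ever sees the working set of its strategy."""

    def __init__(self, strategy: CandidateStrategy, score: Callable[[str], float]) -> None:
        self.strategy = strategy
        self.score = score

    def recommend(self, batch_size: int, n_candidates: int) -> list[str]:
        pool = self.strategy.working_set(n_candidates)
        return sorted(pool, key=self.score, reverse=True)[:batch_size]


constructor = RejectionConstructor(is_valid=lambda s: "a" in s, seed=1)
recommender = ToyRecommender(strategy=constructor, score=lambda s: s.count("a"))
batch = recommender.recommend(batch_size=5, n_candidates=1000)
```

Because the strategy is injected per recommender, a random recommender could be given a trivial strategy while an expensive model-based one could be given aggressive subsampling. Whether the real design should attach the strategy to the recommender, the search space, or a separate object is left open here.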

Related

  • Sub-issue #796 (Eager search space materialization): lazy candidates solve the production side; this issue is about the consumption side. The two are co-dependent: laziness without selection still forces eventual full materialization, and selection without laziness cannot exploit streaming.
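As an illustration of how selection and laziness could combine, the sketch below draws a uniform sample from a lazily generated Cartesian product using reservoir sampling (Algorithm R). The function name `reservoir_sample` and the toy three-parameter space are assumptions made for the example; nothing here reflects an existing interface.

```python
import itertools
import random
from typing import Iterable, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: int = 0) -> list[T]:
    """Uniformly sample up to k items from a stream of unknown length using O(k) memory."""
    rng = random.Random(seed)
    reservoir: list[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical lazy space: 100 x 100 x 10 = 100,000 rows, generated one at a time.
lazy_space = itertools.product(range(100), range(100), range(10))
sample = reservoir_sample(lazy_space, k=500)
```

Only 500 rows are held in memory at any time, although the stream is still traversed once in full. For spaces on the order of 10^9 rows, a single full pass may itself be too slow, so a practical strategy would likely need index-based or structured sampling instead.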
