Add _ReducedSearchSpace for parameter-only reduced search spaces#819
Add _ReducedSearchSpace for parameter-only reduced search spaces#819kalama-ai wants to merge 7 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR introduces a private, lightweight _ReducedSearchSpace to support GP kernel factory calls on “parameter-only” search spaces (e.g., with some parameters stripped) without rebuilding expensive discrete Cartesian products, via SearchSpace._without_parameters(names).
Changes:
- Added
SearchSpace._without_parameters(...)and_ReducedSearchSpace(parameter-onlySearchSpacesubclass) inbaybe/searchspace/core.py. - Added a dedicated test suite validating exposed/blocked attributes and a kernel split/product equivalence check in
tests/test_searchspace.py. - Added a TODO in the GP surrogate fit path about validating factory return types when operating on reduced search spaces.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
baybe/searchspace/core.py |
Adds _without_parameters and new _ReducedSearchSpace with allowlisted attribute access and parameter-derived properties. |
tests/test_searchspace.py |
Adds tests for reduced search space behavior and kernel factory interoperability (active dims consistency). |
baybe/surrogates/gaussian_process/core.py |
Adds a TODO note about validating kernel factory outputs when reduced search spaces are used. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @override | ||
| def __getattribute__(self, name: str): | ||
| """Guard attribute access, allowing only parameter-related attributes.""" | ||
| if name.startswith("__"): | ||
| return object.__getattribute__(self, name) | ||
| allowed = object.__getattribute__(self, "_ALLOWED_ATTRIBUTES") | ||
| if name in allowed: | ||
| return object.__getattribute__(self, name) | ||
| raise NotImplementedError( | ||
| f"'{object.__getattribute__(self, '__class__').__name__}' does not " | ||
| f"support attribute '{name}'. Only parameter information is available." | ||
| ) |
| names | ||
| ) + self.continuous.get_parameters_by_name(names) | ||
|
|
||
| def _without_parameters(self, names: Collection[str], /) -> _ReducedSearchSpace: |
There was a problem hiding this comment.
I'm very open to discussing the method names once we have a truly public interface, but for now I suggest to rename it to _drop_parameters, since we already have it on other classes as well
There was a problem hiding this comment.
you need to also update the PR description, which still contains the old name. I guess there are now also other things to be changed in the description, since your latest commits?
AVHopp
left a comment
There was a problem hiding this comment.
I am in general happy with the PR, mostly minor comments (and one where we might want to discuss a bit)
| Returns: | ||
| A reduced search space containing only parameter information. | ||
| """ | ||
| current_names = {p.name for p in self.parameters} |
There was a problem hiding this comment.
Couldn't you also precompute set_names = set(names) here to avoid the double calculation in both the if and the calculation of remaining?
| allowed = object.__getattribute__(self, "_ALLOWED_ATTRIBUTES") | ||
| if name in allowed: | ||
| return object.__getattribute__(self, name) | ||
| raise NotImplementedError( |
There was a problem hiding this comment.
Shouldn't this rather raise an AttributeError?
| ) | ||
|
|
||
| ### Kernel | ||
| # TODO: When calling a factory on a `_ReducedSearchSpace`, validate that |
There was a problem hiding this comment.
Just to be sure, this is intended to be done in a follow-up PR I assume and not in this one?
There was a problem hiding this comment.
Yes, this was only meant to highlight where the new _ReducedSearchSPaceclass would be used for the first time. Our plan is to pass it only to BayBE factories and block gpytorch kernels. This is why the current allow list in the class blocks everything index related.
| reduced = searchspace._drop_parameters({"Task"}) | ||
| assert reduced.n_tasks == 1 | ||
|
|
||
| def test_get_n_comp_rep_columns(self, searchspace): |
There was a problem hiding this comment.
Wouldn't it make sense to also add a ParameterSelctor-based test and not only str-based ones?
| def test_blocked_attributes(self, searchspace): | ||
| """Verify that accessing non-parameter attributes raises an error.""" | ||
| reduced = searchspace._drop_parameters({"Task"}) | ||
| with pytest.raises(NotImplementedError): |
There was a problem hiding this comment.
Is there any programmatic way of accessing all of these fields while also ensuring that we actually get all? If we'd now add a new field to SearchSpace, one would need to remember that this test here should also be changed. I would imagine some mechanism to go through all fields that are not part of the allowed list, verify that trying to access them raises an error and check whether any of the fields has did not raise an error.
| assert np.isclose(interpoint_constraint_result, 0.6, atol=1e-6) | ||
|
|
||
|
|
||
| class TestReducedSearchSpace: |
There was a problem hiding this comment.
We usually don't use test classes, so I would prefer if this follows the same style that we use in other places as well (looping in @AdrianSosic since he might say "No, this is the better approach, we should keep it and not use the bad one that we use everywhere else just for consistency!")
There was a problem hiding this comment.
so far we haven't used them. Perhaps we will in the future, perhaps not 🙃
There was a problem hiding this comment.
But you know what: we need an easy way to remove all the corresponding tests later, without having to read each test body, so some grouping is definitely needed. So can you either:
- go back to the class approach or
- more the tests to a dedicated test file
|
|
||
|
|
||
| @define(slots=False) | ||
| class _ReducedSearchSpace(SearchSpace): |
There was a problem hiding this comment.
Do we think that inheritance is the way to go here? This implementation clearly violates LSP, and we could probably also use Protocol instead. I am not saying that we should - in fact, my gut feeling is that using the inheritance here is the better way right now, but I want to at least open up the discussion and make sure that we all are happy with this design. Hence also looping in @Scienfitz and @AdrianSosic for their opinions :)
There was a problem hiding this comment.
Fair point. But tbh I this is hopefully only a temporary solution and I think we can live with it.
There was a problem hiding this comment.
Have also talked with @kalama-ai offline about a related issue, but my opinion: this is clearly a hack anyway and not a first-class new feature, so do whatever it takes to bring you to the goal as quickly as possible 😬
- Introduce _ReducedSearchSpace subclass with subset of accessible attribute - Add SearchSpace._without_parameters() factory method - Override parameters, parameter_names, comp_rep_columns, task_idx - Add TODO in GaussianProcessSurrogate._fit for future dispatching - Add tests including kernel product integration test
- Replace `task_idx` sentinel checks with `n_tasks == 1 / > 1` across kernel factories and fit criterion (equivalent because `TaskParameter` enforces min 2 values) - Add `SearchSpace._get_n_comp_rep_columns(selector)` for counting number of indices in comp representation - Remove `task_idx` and `get_comp_rep_parameter_indices` from the reduced searchspace allowlist since these return integer indices that are only meaningful relative to the reduced space, not the full training tensor - Override `_get_n_comp_rep_columns` on `_ReducedSearchSpace` to avoid calling the blocked `get_comp_rep_parameter_indices`
Co-authored-by: AdrianSosic <adrian.sosic@merckgroup.com>
Co-authored-by: AdrianSosic <adrian.sosic@merckgroup.com>
- Build real SubspaceDiscrete/SubspaceContinuous objects with zero rows instead of storing a flat _parameters tuple - Remove _parameters field, all property overrides, __attrs_post_init, override, and _get_n_comp_rep_columns override - Convert TestReducedSearchSpace class to standalone test functions - Add programmatic test that discovers all SearchSpace attributes and verifies non-allowed ones raise NotImplementedError
e341f4a to
fa31cc4
Compare
| "_task_parameter", | ||
| "n_tasks", | ||
| "_get_n_comp_rep_columns", | ||
| "get_comp_rep_parameter_indices", |
There was a problem hiding this comment.
Now you have added this one here to the list, which shouldn't be there. I know, _get_n_comp_rep_columns needs it, but you can work around that in its implementation
Introduces
_ReducedSearchSpace(privateSearchSpacesubclass) that exposes only parameter information without building the expensive discrete Cartesian product. This is constructed viaSearchSpace._without_parameters(names), which returns a reduced version with the specified parameters removed.The class blocks access to anything beyond parameter-related properties.
Motivation
When building composite GP kernels, we need to call kernel factories on search spaces with certain parameters stripped away. A full
SearchSpace.from_product()would rebuild the entire discrete candidate set._ReducedSearchSpaceavoids this by storing the parameter tuple directly and computing everything from it.Changes to
SearchSpaceand kernel factoriesThe reduced search space exposes only name-based properties, no integer indices. To support this, we made two changes to existing code:
task_idxsentinel checks withn_tasks > 1/n_tasks == 1in kernel factories and fit criterion factories. (Equivalent becauseTaskParameterrequires a minimum of 2 values).SearchSpace._get_n_comp_rep_columns(selector)that returns the number of comp-rep columns for a parameter selection. This replaces the previouslen(get_comp_rep_parameter_indices(...))in_get_effective_dimensionality, avoiding the need to expose index-returning methods on the reduced space.Design of
_ReducedSearchSpaceSearchSpaceso it passes type checks and works with existing factory signaturestuple[Parameter, ...]and derivesparameters,parameter_names,comp_rep_columnsfrom it__getattribute__with an allowlist — accessingtransform,constraints,discrete,continuous, etc. raisesNotImplementedError__attrs_post_init__Allowed attributes
Outlook
This class is a building block for kernel override dispatching in the GP surrogate. The dispatching logic will: