Sample Weights introduction #652

filippozacchei · 2025-10-13T23:32:39Z

Introduction of Weighted Samples in SINDy.fit()

This pull request introduces support for weighted samples in both standard and weak formulations of SINDy.
It enables users to provide per-sample or per-trajectory weights that influence model fitting, improving control over data importance and uncertainty propagation.

Main Changes

`_core._expand_sample_weights`

Introduces a unified weight-expansion utility to validate and broadcast user-provided sample weights.
In standard SINDy, weights are expanded to match the total number of samples (n_samples × 1).
In weak SINDy, each trajectory’s scalar weight is repeated according to the number of weak test functions (`(
(n_trajectories \cdot K) × 1`).
Strict validation ensures that:
- Weak mode accepts exactly one scalar weight per trajectory.
- Standard mode allows either scalars or per-sample arrays.

`weighted_weak_pde_library`

Implements Generalized Least Squares (GLS) whitening for weak-form regression.
When non-uniform spatiotemporal weights are provided, the feature matrix (Θ) and right-hand side (V) are prewhitened using the Cholesky factor of the covariance matrix: $\mathrm{Cov}[V] = V' \Sigma (V')^\top$

where:
- Σ represents the pointwise sample weights.
- V′ are the weak integration matrices for the right-hand side.

Automated Tests

Added two new test suites to validate the weighting behavior.

To do: account for WEAK sindy

Jacob-Stevens-Haas · 2025-10-14T21:00:42Z

Hey, thanks for this PR! Some quick thoughts:

Weak form is going through a big refactor, and the WeakPDELibrary will be deprecated in the next release. In it's place, WeakSINDy will be a subclass of _BaseSINDy. This will allow removing a lot of if weak: that is sprinkled throughout the package, as well as enable prediction and simulation of weak models and pave the way for people to implement different basis functions. IDK how you want to proceed - wait to see what weak branch comes up with in a couple weeks, or just continue with strict "1-weight-per-trajectory" for now so it's the same for weak and regular?
On a related note, why is GLS whitening desired?
I know I steered you away from SampleConcatter before, but maybe calls to _expand_sample_weights should go there, rather than in fit() and score. It might be able to reduce redundant code. It helps if the logic to handle flattening axes only occurs in one place, since otherwise, developers need to keep multiple axis layout conventions in mind.
Is STLSQ the only optimizer that supports sample weight?
Add type annotations wherever you touch a function signature

filippozacchei · 2025-10-17T20:06:31Z

Thanks for your comment.

I would say then to just use sample weights for only standard Sindy then. The GLS whitening is needed for Weak SINDy if we have samples of the same trajectory of different importance, but I would avoid for the time being.

Will look into SampleConcatter and update!

Jacob-Stevens-Haas

Sorry for the long delay! I can finally begin to prioritize this PR. As I mentioned, it's best to leave Weak for another day, and to put the sample array alignment in the same place as trajectories are already being flattened and combined (SampleConcatter). Other than that, there's a few tweaks to _expand... and to the tests

Jacob-Stevens-Haas · 2025-12-08T02:14:48Z

pysindy/_core.py

 from sklearn.pipeline import Pipeline
 from sklearn.utils.validation import check_is_fitted
+
+set_config(enable_metadata_routing=True)


Leave this up to the user. scikit-learn documentation explains that if they're using a pipeline or composite transform, they'll need this.

Jacob-Stevens-Haas · 2025-12-08T02:15:23Z

pysindy/_core.py

+        self.set_fit_request(sample_weight=True)
+        self.set_score_request(sample_weight=True)
+        self.optimizer.set_fit_request(sample_weight=True)


Again, leave this up to the user

Jacob-Stevens-Haas · 2025-12-08T02:15:36Z

pysindy/_core.py

        x_dot=None,
        u=None,
        feature_names: Optional[list[str]] = None,
+        sample_weight=None,


Provide static type

Jacob-Stevens-Haas · 2025-12-08T02:16:59Z

pysindy/_core.py

            Names for the input features (e.g. :code:`['x', 'y', 'z']`).
            If None, will use :code:`['x0', 'x1', ...]`.

+        sample_weight : float or array-like of shape (n_samples,), optional


Why allow a single float?

Also, shouldn't this be (*n_spatial, n_time)? Or a list of those arrays, for when multiple trajectories are set?

Jacob-Stevens-Haas · 2025-12-08T02:18:10Z

pysindy/_core.py

        x_dot = concat_sample_axis(x_dot)
        self.model = Pipeline(steps)
-        self.model.fit(x, x_dot)
+        self.model.fit(x, x_dot, sample_weight=sample_weight)


Let's remove the Pipeline and directly call the components. That way, we don't need to rely on metadata routing internally

Jacob-Stevens-Haas · 2025-12-08T02:33:31Z

test/test_weights_routing.py

+
+    # --- Fit without explicit sample weights ---
+    sindy.fit(X_trajs, t=0.1, x_dot=Xdot_trajs)
+    coef_unweighted = np.copy(sindy.model.named_steps["model"].coef_)


just use sindy.optimizer.coef_

Jacob-Stevens-Haas · 2025-12-08T02:33:39Z

test/test_weights_routing.py

+    # --- Fit with sample weights to emphasize trajectory 3 (different system) ---
+    sample_weight = [np.ones(len(x_a)), np.ones(len(x_a)), 10 * np.ones(len(x_b))]
+    sindy.fit(X_trajs, t=0.1, x_dot=Xdot_trajs, sample_weight=sample_weight)
+    coef_weighted = np.copy(sindy.model.named_steps["model"].coef_)


same as above

Jacob-Stevens-Haas · 2025-12-08T02:36:19Z

test/test_weights_routing.py

+    # 3. Weighted model should bias toward system B coefficients
+    #    since trajectory B had much higher weight
+    # True systems differ by factor of 2
+    ratio = np.mean(np.abs(coef_weighted / coef_unweighted))
+    assert (
+        ratio > 1.05
+    ), "Weighted coefficients should reflect stronger influence from system B"


You don't need both an error message and description (prefer the error message). Also, make ratio > 1.5, since it should be substantially towards system B

Suggested change

# 3. Weighted model should bias toward system B coefficients

# since trajectory B had much higher weight

# True systems differ by factor of 2

ratio = np.mean(np.abs(coef_weighted / coef_unweighted))

assert (

ratio > 1.05

), "Weighted coefficients should reflect stronger influence from system B"

# True systems differ by factor of 2

ratio = np.mean(np.abs(coef_weighted / coef_unweighted))

fail_msg = "Weighted coefficients should reflect stronger influence from system B"

assert ratio > 1.5, fail_msg

Jacob-Stevens-Haas · 2025-12-08T02:37:52Z

test/test_weights_routing.py

+    # --- Assertions ---
+    # 1. Shapes are consistent
+    assert coef_weighted.shape == coef_unweighted.shape
+
+    # 2. The coefficients must differ when weighting is applied
+    assert not np.allclose(coef_weighted, coef_unweighted)


These are implied by the following tests. That's justified if these are more specific. However, shape errors will show explicitly in the below code. Additionally, not np.allcolse is a less-specific version of the next assertion

Jacob-Stevens-Haas · 2025-12-08T02:43:57Z

test/test_weights_routing.py

+    assert np.linalg.norm(coef_weighted - 2 * coef_unweighted) < np.linalg.norm(
+        coef_unweighted
+    )


I don't see how this is correct? It would make more sense to explicitly form coef_a and coef_b, then check the equations that you added as comments.

Jacob-Stevens-Haas · 2025-12-08T04:41:36Z

As a heads up, we're changing the default branch from master to main, so you'll need to change the target of the PR to main

filippozacchei added 2 commits September 24, 2025 15:21

Added sample weight to SINDy fit method.

60f520c

To do: account for WEAK sindy

Merge branch 'master' of github.com:dynamicslab/pysindy

9b7d2dd

filippozacchei force-pushed the master branch from 9ac839d to 9b7d2dd Compare October 14, 2025 16:45

Files to be modified

97c0e10

Jacob-Stevens-Haas linked an issue Oct 14, 2025 that may be closed by this pull request

Introduce sample_weights inside fit function #645

Open

Jacob-Stevens-Haas requested changes Dec 8, 2025

View reviewed changes

Sample Weights introduction #652

Are you sure you want to change the base?

Sample Weights introduction #652

Uh oh!

Conversation

filippozacchei commented Oct 13, 2025

Main Changes

_core._expand_sample_weights

weighted_weak_pde_library

Automated Tests

Uh oh!

Jacob-Stevens-Haas commented Oct 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

filippozacchei commented Oct 17, 2025

Uh oh!

Jacob-Stevens-Haas left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Jacob-Stevens-Haas commented Dec 8, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

`_core._expand_sample_weights`

`weighted_weak_pde_library`

Jacob-Stevens-Haas commented Oct 14, 2025 •

edited

Loading