@RobbieKiwi
Contributor

@RobbieKiwi RobbieKiwi commented Sep 9, 2025

Closes #450

Previous logic

For as_data_array when coords is provided:

  • If array is constant, broadcast to coords
  • If array is pandas or xarray, ignore coords and use coords of array
  • If array matches coords, then keep coords

For a more concrete example, if you provide two-dimensional coords, this is the result for different input data:

  • 0 dimensions -> 2 dimensions
  • 1 dimension -> 1 dimension
  • 2 dimensions -> 2 dimensions

The behavior of 1 dimension -> 1 dimension is clearly the odd one out and is not very intuitive.
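
The previous rule can be sketched in plain Python by tracking dimension names only. This is a simplified model of the behavior described above, not the actual as_data_array implementation; the function name previous_result_dims and the dimension names "time" and "node" are made up for illustration.

```python
def previous_result_dims(input_dims, coords_dims):
    """Model the previous rule, tracking dimension names only (hypothetical helper)."""
    if len(input_dims) == 0:
        # A constant carries no dimensions, so it is broadcast to all of coords.
        return tuple(coords_dims)
    # Labelled input (pandas / xarray) keeps its own dimensions; coords are ignored.
    return tuple(input_dims)


coords_dims = ("time", "node")
for input_dims in [(), ("time",), ("time", "node")]:
    result = previous_result_dims(input_dims, coords_dims)
    print(f"{len(input_dims)} dimension(s) -> {len(result)} dimension(s)")
```

The middle case is the one this PR is about: the one-dimensional input comes back one-dimensional even though two-dimensional coords were requested.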

This behavior was noticed by people using m.add_variables, which calls as_data_array under the hood.

Changes proposed in this Pull Request

Add the force_broadcast option to as_data_array. When True, it will always try to broadcast to the dimensions implied by coords.
For the example above this means:

  • 0 dimensions -> 2 dimensions
  • 1 dimension -> 2 dimensions
  • 2 dimensions -> 2 dimensions
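
As a rough illustration of what "always broadcast to the dimensions implied by coords" means, the sketch below uses plain xarray rather than the code in this PR. The helper broadcast_to_coords, the template array, and the coordinate names are assumptions made for the example; only DataArray.broadcast_like is an existing xarray method.

```python
import numpy as np
import xarray as xr

# Two-dimensional coords, as in the example above (names are illustrative).
coords = {"time": [0, 1, 2], "node": ["a", "b"]}
template = xr.DataArray(np.zeros((3, 2)), coords=coords, dims=["time", "node"])


def broadcast_to_coords(arr):
    """Hypothetical helper: expand arr to every dimension of the template."""
    data = arr if isinstance(arr, xr.DataArray) else xr.DataArray(arr)
    return data.broadcast_like(template)


zero_dim = 5.0
one_dim = xr.DataArray([10.0, 20.0, 30.0], coords={"time": [0, 1, 2]}, dims=["time"])
two_dim = xr.DataArray(np.ones((3, 2)), coords=coords, dims=["time", "node"])

for arr in (zero_dim, one_dim, two_dim):
    result = broadcast_to_coords(arr)
    print(result.dims, result.shape)
```

All three inputs end up with the dims ("time", "node"); the one-dimensional input is repeated along "node" instead of staying one-dimensional.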

Use this new option inside m.add_variables so that variable creation is more intuitive.

Note that this is a breaking change for anyone who was relying on the previous behavior when creating variables (i.e. they were depending on the coords argument being ignored). This seems like quite an edge case, so perhaps it's OK to change it directly.

Checklist

  • Code changes are sufficiently documented; i.e. new functions contain docstrings and further explanations may be given in doc.
  • Unit tests for new features were added (if applicable).
  • A note for the release notes doc/release_notes.rst of the upcoming release is included.
  • I consent to the release of this PR's code under the MIT license.

@FabianHofmann FabianHofmann self-requested a review September 21, 2025 20:09
@FabianHofmann
Collaborator

FabianHofmann commented Oct 28, 2025

Thanks @RobbieKiwi! I have to look into this a bit more, and we need to be careful here as it could lead to unexpected behavior. And yes, I am not happy with the API in terms of coords alignment; it is just too vague, and we need a strict convention here. We also need to think about the different operations:

  • var + var
  • coeff * var
  • const + coeff * var

Ideally they would all follow the same convention, but currently they do not. For example, there is the logic that the coords of a secondary term in operations like c1 * v1 + c2 * v2 are ignored if both c1 * v1 and c2 * v2 have the same shape. On top of that, we have the handling of indexed and non-indexed constants / coeffs.

@RobbieKiwi
Contributor Author

Hey, thanks for having a look. Let me know if you have any ideas on how to progress with this.

Development

Successfully merging this pull request may close these issues.

Inconsistent creation of dimensions in add_variables
