-
Notifications
You must be signed in to change notification settings - Fork 297
Refactor shape masking #6129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Refactor shape masking #6129
Conversation
|
From @SciTools/peloton : discussed what we are seeing here and in #6126. |
|
Discussion with @acchamber notes:
|
|
What are @SciTools/peloton (@pp-mo) thoughts about adding This would facilitate better handling of shapes to rasters (which is essentially the problem we're solving) with the added bonus that we could also mask to other shape types, like lines, to extract a trajectory from a Cube, for example. |
I was planning on going to the AVD surgery this week to discuss some changes and/or make the case for rasterio |
|
Thanks for your patience everyone. It's sometimes a struggle to balance everything and we just haven't had an opportunity to discuss this. Coming to the Surgery (UK Met Office) is an ideal next step. |
|
@pp-mo, @trexfeathers and @stephenworsley have discussed this at the Surgery and agree in principle that we could consider adding rasterio as an optional dependency. |
for more information, see https://pre-commit.ci
|
From @SciTools/peloton: would @hsteptoe and @acchamber appreciate any more input from core devs at this point? |
I just need to find more time, which is in short supply as we get near the end of FY... |
|
This is now ready for some initial (alpha?) testing 🎉. @acchamber are you available to try out the new features? (Updates to docs etc. will follow if this is successful) Tests are incomplete are should not be review yet. This is a fairly substantial re-write. New/improved features include:
There are some breaking changes, principally moving away from having a My informal tests so far show working behaviour for iris test data from |
pre-commit related errors may have been something wrong going on with my pre-commit cache. Did a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still left to review:
-
_shapefiles.py - tests
- lock files
| A number between 0-1 describing what % of a cube cell area must the shape overlap to be masked. | ||
| Only applied to polygon shapes. If the shape is a line or point then this is ignored. | ||
| Other Parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you decide which ones were kwargs and which ones were named parameters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided that kwargs would be parameters that are specifically used by rasterio functions only, but I'm happy to expose them if that's clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that's true of invert. And you are doing a little handling of all_touched too. It's not like the 10s of Matplotlib arguments which iris.plot passes straight down; I think in this case it would be clearer to have everything as normal Parameters.
Today I learned that Sphinx sticks Parameters and Other Parameters together when rendering 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still left to review:
- tests
- lock files
- confirm stable behaviour given the heavy refactor
| # to "OFF" in the environment. Having PROJ_NETWORK = "ON" | ||
| # can lead to some coordinate transformations resulting in Inf values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What a pointless feature! I'd be interested to know why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be safe, please could we turn this off at the beginning of create_shape_mask(), then reactivate at the end? I don't know if other users might be relying on it so don't want to a make a permanent change.
| # Define raster transform based on cube | ||
| # This maps the geometry domain onto the cube domain | ||
| tr = _make_raster_cube_transform(cube) | ||
| # Generate mask from geometry | ||
| mask_template = rfeatures.geometry_mask( | ||
| geometries=shapely.get_parts(geometry), | ||
| out_shape=(h, w), | ||
| transform=tr, | ||
| all_touched=all_touched, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you be up for giving me a 5 minute lesson about this?
Co-authored-by: Martin Yeo <[email protected]>
Co-authored-by: Martin Yeo <[email protected]>
Co-authored-by: Martin Yeo <[email protected]>
…bugfix-shape-masking
for more information, see https://pre-commit.ci
…bugfix-shape-masking
…bugfix-shape-masking
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #6129 +/- ##
==========================================
+ Coverage 90.29% 90.32% +0.02%
==========================================
Files 91 91
Lines 24656 24696 +40
Branches 4618 4626 +8
==========================================
+ Hits 22264 22307 +43
+ Misses 1620 1613 -7
- Partials 772 776 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
⏱️ Performance Benchmark Report: b6002b8Performance shiftsFull benchmark resultsGenerated by GHA run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
End-of-review! Thanks again @hsteptoe
| -------- | ||
| For best masking results, both the cube _and_ masking geometry should have a | ||
| coordinate reference system (CRS) defined. Note that CRS of the masking geometry | ||
| must be provided explicitly to this function (via ``geometry_crs``), whereas the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| must be provided explicitly to this function (via ``geometry_crs``), whereas the | |
| must be provided explicitly to this function (via ``shape_crs``), whereas the |
| If a CRS is _not_ provided for the the masking geometry, the CRS of the cube is assumed. | ||
| This function requires additional dependencies: ``rasterio`` and ``affine``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This warning should also be against mask_cube_from_shapefile(), and we should probably warn existing users in the existing What's New as well - their experience will change.
I didn't think of this side of things before; obliging people to adopt new dependencies is a pretty big breakage to include in a minor release. At this late stage I'm inclined to roll with it and we can back it out if it becomes a massive issue (unlikely).
| x_coord = cube.coord(axis="X", dim_coords=True) | ||
|
|
||
| # Check for CRS equality and transform if necessary | ||
| cube_crs = cube.coord_system().as_cartopy_projection() # type: ignore[union-attr] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things:
- You don't need to call
cube.coord_system()again here - you already assigned that on line 147. - One of the
cube_crsvariables should be renamed, otherwisecube_crshas two different types depending on which line you are running.
These things will fix the MyPy error, which will make Iris more safely typed, making future maintenance simpler.
| raise ValueError(err_msg) | ||
|
|
||
| # Get cube coordinates | ||
| x_coord, y_coord = [cube.coord(axis=a, dim_coords=True) for a in ("X", "Y")] # type: ignore[arg-type] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| x_coord, y_coord = [cube.coord(axis=a, dim_coords=True) for a in ("X", "Y")] # type: ignore[arg-type] | |
| axes: tuple[Axis, Axis] = ("X", "Y") | |
| x_coord, y_coord = [cube.coord(axis=a, dim_coords=True) for a in axes] |
And at the top of the file:
if typing.TYPE_CHECKING:
from iris.util import Axis| cube = cube.intersection(iris.coords.CoordExtent(x_coord.name(), -180, 180)) | ||
| # Get revised x coordinate | ||
| x_coord = cube.coord(axis="X", dim_coords=True) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| assert isinstance(x_coord, DimCoord) | |
| assert isinstance(y_coord, DimCoord) | |
I think this clears out the remaining type hinting. There's probably a way to better label cube.coord() but not something for right now.
| wgs84 = CRS.from_epsg(4326) # WGS84 coordinate system | ||
| osgb = CRS.from_epsg(27700) # OSGB coordinate system |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get these into a fixture (as mentioned in integration/test_mask_cube_from_shape.py)
| Polygon( | ||
| [ | ||
| (686600.5247600826, 18834.835866007765), | ||
| (622998.2965261642, 1130592.5248690117), | ||
| (-45450.06302316813, 1150844.967615187), | ||
| (-172954.59474739246, 41898.60193228102), | ||
| (686600.5247600826, 18834.835866007765), | ||
| ] | ||
| ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mark this as Known Good Output to protect our future developers from confusion!
| masked_cube = mask_cube_from_shape( | ||
| mock_cube, shape, shape_crs=CRS.from_epsg(4326), in_place=True | ||
| ) | ||
| assert masked_cube is None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also assert that the change has taken effect on mock_cube.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you've achieved test coverage elsewhere but it's important to have some coverage at this level - we're free to make any changes to _shapefiles so long as the top layer behaviour remains the same; coverage at the top level helps ensure that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comments on unit/util/test_mask_cube_from_shape.py
🚀 Pull Request
Description
Fixes #6126 by exposing the shape geometry to the user. Adds to docs and docstrings to make it clearer how to use this.
Consult Iris pull request check list
Add any of the below labels to trigger actions on this PR: