Add grouped_stats for categorical or continuous binning #815
@adebardo Thanks for all this work! Nice that it's taking shape. I think it's getting close, and we should be able to add zonal stats shortly after 🙂. (I'm actually working on separate code that should make it fairly easy to add Dask/Multiprocessing support there.) On the technical aspects of the implementation: to be honest, I had a lot of trouble understanding them. I hope I didn't miss anything obvious 😅
Using GeoUtils functions makes it very easy to visualise them.

```{code-cell} ipython3
group_by = {"raster": rast.data}
```
Why is there no file linked to rast?
Because we only cover NDArrayNum here. Do you want it to be more inclusive?
It's in case the user wants to reproduce the example, which isn't possible with this example as written.
I'm not sure I understand the problem, but if you look at the beginning of the stats.md file, you'll see a file attached to the raster.
Resolves #774
Context
The purpose of this PR is to offer users a simplified API for computing grouped statistics on their rasters.
To do this, we rely on pandas' ability to group 1D arrays by an associated classification.
We implemented the function directly in the base.py file, along with its unit tests.
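
To illustrate the idea (not the actual implementation in base.py), here is a minimal sketch of grouping flattened raster values with pandas. The array contents, column names and bin edges are made up for the example; only `pd.cut` and `groupby(...).agg(...)` are standard pandas calls.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x3 raster values and a class array of the same shape.
values = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0],
                   [7.0, 8.0, 9.0]])
classes = np.array([[0, 0, 1],
                    [0, 1, 1],
                    [2, 2, 2]])

# Flatten both arrays to 1D so that each pixel becomes one row.
df = pd.DataFrame({"value": values.ravel(), "class": classes.ravel()})

# Categorical binning: one group per class label.
by_class = df.groupby("class")["value"].agg(["mean", "count"])

# Continuous binning: cut the values into intervals, then group by interval.
bins = [0, 3, 6, 9]  # illustrative bin edges, right-inclusive by default
df["bin"] = pd.cut(df["value"], bins=bins)
by_bin = df.groupby("bin", observed=True)["value"].agg(["mean", "count"])

print(by_class)
print(by_bin)
```

Both results are DataFrames indexed by group, with one column per statistic, which is what makes them easy to compare against hand-computed values.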
Code
Tests
For testing purposes, we provide a fake raster so that the ground truth can be calculated by hand.
We have implemented one test per bin type.
We also introduce pandas DataFrame equality checks into this file, along with the necessary tests.
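
As a rough sketch of this testing approach, the example below builds a tiny fake raster, computes grouped statistics, and compares them to hand-computed values. It assumes pandas' own `pandas.testing.assert_frame_equal` rather than the equality helper added in this PR, and `grouped_mean_max` is a hypothetical stand-in for the function under test.

```python
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal

# Tiny fake raster: the expected statistics are easy to compute by hand.
values = np.array([[10.0, 20.0], [30.0, 40.0]])
groups = np.array([[0, 0], [1, 1]], dtype=np.int64)


def grouped_mean_max(values, groups):
    """Hypothetical helper (not the PR's function): mean and max per group."""
    df = pd.DataFrame({"value": values.ravel(), "group": groups.ravel()})
    return df.groupby("group")["value"].agg(["mean", "max"])


result = grouped_mean_max(values, groups)

# Hand-computed: group 0 holds 10 and 20, group 1 holds 30 and 40.
expected = pd.DataFrame(
    {"mean": [15.0, 35.0], "max": [20.0, 40.0]},
    index=pd.Index([0, 1], dtype="int64", name="group"),
)

# Raises AssertionError with a readable diff if the frames differ.
assert_frame_equal(result, expected)
```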
Documentation
https://adebardo-geoutils.readthedocs.io/en/774-grouped-stats/stats.html
We have added documentation to the statistics tab. Currently, the examples are run on elevation bins of a raster against that same raster, as well as on a binarised glacier mask. We have opened a ticket to propose data additions.