-
Notifications
You must be signed in to change notification settings - Fork 9
Environment module (Climate and AQ results from emissions) #69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
ian-ross
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my primary comment here would be that you're breaking Rule Number 1, which is "Write programs for people, not computers" (direct quote from here). I don't want to give you a hard time about this at all, because I think you may never have been encouraged to think about things in this way before, but you should imagine some potential user of AEIC coming to this code and trying to understand what it is that you're trying to do.
You have lots of numbers taken from some sources, some of which are clearly physical constants, some of which are "soft" physical constants (e.g., radiative forcing response) whose values are arguable and which you might want to do sensitivity studies on, and some of which are values used in your model that might or might not be constants in the conventional sense, but in reality depend on the assumptions in the model. All of those different things are bundled together without any kind of explanation as to why they belong together or what they're for.
The computer doesn't care at all (or know about) the semantics of what you're doing. It doesn't care whether you give names to physical constants or cut and paste numbers from somewhere. But a human reader does care. When you write something like this, ask yourself "Could I give this to someone and have them understand the intent of what I'm doing without me needing to explain it in person?" If the answer is "No" or "Maybe, if they tried really really hard and already knew almost everything about what I'm doing", then you need to do more.
This is a different kind of problem than a software organization problem, because there simply isn't enough information in what you've written to assess whether the organization is good or not. I initially thought "What? Why are the specific heat and density of water in a configuration class? They're constants!", which led me to think that everything else in there was a constant (because there was no context or explanation). But reading further and thinking about it, it's clear that some things might be less well-established parameters and you might want to change them to see what effect they have. But I shouldn't have to puzzle those things out. They should be explained explicitly.
| @@ -0,0 +1,122 @@ | |||
| from dataclasses import dataclass | |||
| from functools import cached_property | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't normally need this. The only reason for using @cached_property instead of @property is that the calculation of your property is expensive and you don't want to repeat the calculation. By using @cached_property, you also make an implicit promise that instances of your class are immutable, or at least the attributes that go into calculating the property are. The cost of that extra thinking is such that you normally do not want @cached_property.
|
|
||
|
|
||
| @dataclass(frozen=True) | ||
| class deltaT_2box_output: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Class names in Python should be in CamelCase.
| import numpy as np | ||
|
|
||
|
|
||
| # TODO: Import this into environment config and full Config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? It's not "configuration". It's a set of constant values. Some of these are physical constants that should live in AEIC.utils.consts (which should be renamed to AEIC.utils.constants, since saving 3 characters is kind of a waste of time!). The rest are constants taken from some unspecified references, but they're still constants. Or are they? You need to explain. The computer doesn't care, but a human reader does.
| # Density of water | ||
| rho = 1000 # kg/m^3 | ||
|
|
||
| # Specific heat of Mixed Layer (70m) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this come from? Since it's immediately following definitions of the specific heat and density of water, you might be led to expect that it should be c_water * rho * h with h being 70m. But it's not... Without explanation and references, these are just magic numbers.
| Uses iterative forward stepping through time | ||
| """ | ||
| years = len(RF) | ||
| sec_per_yr = 60 * 60 * 24 * 365.25 # seconds per year |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be a constant in AEIC.utils.units.
| """ | ||
| return self._get_alpha(self.C1) | ||
|
|
||
| @cached_property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use @property.
| from AEIC.environment.temperature_response import deltaT_2box, deltaT_config | ||
|
|
||
|
|
||
| def test_2box_deltaT(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you describe what invariants or properties of your code this is testing? The asserts lower down where you're testing the values of properties of your "config" structure are just kind of silly: unless you did the calculations that produced the numbers you pasted into this file in a completely independent manner, what is it that you're actually testing? And similarly, the "big test" is just results copied in from somewhere else. That can be OK, if you say where those results came from and why they're a reasonable test comparison. But just dumping a bunch of numbers in a test file like this is unhelpful. The cases where this sort of test can be useful is for checking that results don't change as you refactor code (people often call these "golden tests") or for checking that you really are duplicating the results of a completely independent implementation of whatever you're doing. But in either of those cases, you need to say very explicitly what you're doing and why anyone should care.
Two other things: 1. You're testing two different things here. Split the test function. and 2. If you do do golden tests or any other data based tests, don't dump the data into your Python code like this. Store it in an external file instead. But you almost certainly shouldn't be doing that here anyway. (What functionality of your model function is tested better by having 100 timesteps than by having 2?)
| # Equilibrium Climate Sensitivity (ECS) | ||
| # The equilibrium temperature change (K or C) that results from doubling CO2 | ||
| # Typical range**: 1.5 - 4.5 K (IPCC AR5) | ||
| ECS = 4.0 # K |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So why pick this value?
| # Radiative Forcing from CO₂ Doubling | ||
| RF2xCO2 = 3.93 # W/m^2 | ||
|
|
||
| # Advective mass flux of water from boundary layer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are values assumed in the model. Where do they come from?
| # Typical range**: 1.5 - 4.5 K (IPCC AR5) | ||
| ECS = 4.0 # K | ||
| # Radiative Forcing from CO₂ Doubling | ||
| RF2xCO2 = 3.93 # W/m^2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this come from?
ian-ross
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like very much that you have started to document what you're thinking about here.
(Tongue firmly in cheek for the following, but I hope you'll get the idea...)
I do think that you're falling into what I can't help thinking of as "software Stalinism" again though! You want to put all of the things into one big class over which you have absolute control. Instead, we need more "software Maoism": "let a million functions bloom".
At this point, I would suggest that you take one of the things you need to calculate and just implement it in the simplest possible way. I would start with radiative forcing, so that would be a function called calculate_radiative_forcing that takes a set of emission time series and produces as output time series of radiative forcing. Don't make any configuration classes or anything like that, just that one function and the things you need to be able to call it: some representation of time series, some representation of aggregated emissions, and so on.
See how little additional infrastructure you can do with, and see how clear and legible you can make what you write.
|
|
||
| ## Purpose | ||
|
|
||
| The goal of the environment module is to get detailed and all possible climate and air-quality impact metrics from the emissions output for a choice of configuration parameters, background scenarios, discount rates, etc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the language here ("all possible", "choice of configuration parameters, background scenarios, discount rates, etc.", "all climate and AQ metrics from individual components available for different kinds of message") is too maximalist for my taste. I think you would be much better off starting with a relatively limited number of things you want to calculate, coming up with a good way to organize those calculations and adding the things that need to be in AEIC incrementally as the need arises.
This gets at what we were talking about in the meeting the other day: the functionality that AEIC might present for these kinds of calculations will be the simplest and most straightforward versions of these things. Anyone who really cares about one of these things will do more detailed modelling themself using more domain-appropriate tools. I think that that means that AEIC doesn't need to take such a "do all the things" approach to impacts calculations.
| ```python | ||
| env_config = config.environment() | ||
|
|
||
| env = AEIC.environment.EnvironmentClass(config=env_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do all of these calculations need to be done by a single class? This looks very much like the design of the emissions module, which had the same approach (one Emissions class with lots of things in it), that I'm in the process of breaking down into a more modular view of things.
Most of the things that need to be calculated here are pure functions. You've even drawn pipelines above that emphasize that. There's really no need for these things to be parts of some overriding class, because there should be no state shared between them.
If you want another perspective on this, think about user choice. By putting everything into one big class, you are removing the possibility from the user to compose the different calculations and tools as they like, or to use them in other tools. It's your way or the highway.
| @@ -0,0 +1,305 @@ | |||
| # Environment Module | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a good name for this functionality? You're including radiative transfer calculations, climate sensitivity, climate benchmarks, economic calculations, plus pollution and air quality and mortality calculations. That seems like a lot to put under the innocuous name of "environment". Would "impacts module" be better? That's what that "I" in ACAI is for.
|
|
||
| The goal of the environment module is to get detailed and all possible climate and air-quality impact metrics from the emissions output for a choice of configuration parameters, background scenarios, discount rates, etc. | ||
|
|
||
| The aim is to get all climate and AQ metrics from individual components available for different kinds of messaging. Certain users/audiences would prefer GWP or deltaT over Net Present Value/monetized damages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you write down some realistic use cases? And identify some of these "certain users/audiences"?
| env = AEIC.environment.EnvironmentClass(config=env_config) | ||
|
|
||
| # Radiative forcing only | ||
| rf = env.climate.emit_RF(emissions=em) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why emit_RF? That's a weird name. I can understand why you had emit in the emissions module, but why here too?
Also, looking at your EmissionsOutput class below, I assume this is not the output of the emissions module (which calculates emissions per trajectory). It's some sort of aggregated emissions time series, right? So we need the post-processing step to produce those aggregated emissions values before we get to impacts. Is that right?
If so, I guess that means that all of these impacts/environment calculations have no spatial component to them. I don't know very much about this side of climate research, but do people still do this for calculating things like mean radiative forcing or temperature changes? I know that those things are reported as global averages, but aren't the calculations usually done with spatially resolved models and the results averaged afterwards? I know even less about the air quality side of things, but there spatial variability is even more important.
| ↓ | ||
| Temperature Change (ΔT) | ||
| ↓ | ||
| Climate Metrics (GWP, TP, ATR, CO2e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pipeline picture isn't correct, is it? Some of these climate metrics aren't just a function of temperature change. (ATR is, GWP is not, I don't know what TP is, and I don't know what a CO2 equivalent metric is supposed to be here - the amount of CO2 to get a GWP equal to the GWP from the actual emissions?)
|
|
||
| The aim is to get all climate and AQ metrics from individual components available for different kinds of messaging. Certain users/audiences would prefer GWP or deltaT over Net Present Value/monetized damages. | ||
|
|
||
| While the module has all the pipeline to calculate all the metrics needed (list later in the doc), it also has the ability to switch out with external modules. For example, there is a way to quickly calculate RF from contrails using an estimate of RF/km contrail. But if the user wants they can also choose to calculate contrail impacts using PyContrails. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a genuine need for this? Would it not be better simply to provide utilities to generate PyContrails-friendly input from AEIC data? Wanting to incorporate all external tools into your own thing is usually a bad idea. It's better to organize things so that you can work nicely with external tools, rather than imposing some sort of software totalitarianism on your users.
Doing what you want to do here also introduces a direct dependency on PyContrails, which really seems out of scope for AEIC. My general feeling about this whole "environment/impacts" module is that you should be implementing a toolkit of the more basic ways of calculating the things that people might be interested in. Anyone who wants to do something more sophisticated will, well, do something more sophisticated.
|
|
||
| | Index | Component | | ||
| | ----- | --------------- | | ||
| | 1 | CO2 | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick, without looking at your list, what's rf[6, :]? Using numeric codes like this is what you do in Fortran 77. Don't do it in Python. I would write this as something approximately like:
from enum import StrEnum, auto
class RFComponent(StrEnum):
CO2_BACKGROUND = auto()
CO2_LIFECYCLE = auto()
O2_SHORT = auto()
H2O_SHORT = auto()
...
CH4_LONG = auto()
class TimeSeries:
# Whatever. Could just be a Numpy array, or could be something with
# explicit time information.
...
RFComponents = dict[RFComponent, TimeSeries]
class RFOutput:
def __init__(self, components: RFComponents):
self.components = components
# Assuming TimeSeries can be added:
self.total = sum(components.values())
# Example:
self.CO2 = (
components[RFComponent.CO2_BACKGROUND] +
components[RFComponent.CO2_LIFECYCLE]
)| | `time` | Years since emission (annual resolution) | | ||
|
|
||
| **DIMENSIONS:** | ||
| All time-resolved outputs are shaped as: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By "shaped as" you mean they're Numpy arrays, don't you? Try to get out of the habit of thinking of every piece of composite data as a Numpy array. In this case, the data you're representing is not a matrix, it's a set of parallel time series. There's some more about this in the comment right above.
| Making a spreadsheet with detailed info on constants, configs etc here: | ||
| https://docs.google.com/spreadsheets/d/1zDOw2smQkYmstu-6Txfnw04_3jfwQ2vSCGcHdnl46XA/edit?usp=sharing | ||
|
|
||
| ### MonetizationConfig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I don't think this stuff needs to be in a configuration object. These things can just be parameters to functions, with default values. They're then there right in the place where they're used, so when a user is looking at the documentation about discounting, they see the functions related to discounting and the default discount rate right there in the documentation for the function, rather than in a separate piece of documentation about a configuration class.
Working on adding environment module to AEIC - essentially what APMT does (and ACAI as well)
TO DO for Environment Classs
TO DO for Climate
Two box temperature response modelTO DO for AQ