
Provincial IO table: NaN propagation from sparse sectors causes AssertionError #98

Description

@reetik-sahu

Summary

Running the model with can_disaggregation=True (Canadian provincial IO table) produces eleven
distinct assertion/value errors caused by NaN values propagating from sparse provincial sectors. This occurs mainly when substitution is switched on with the Bundled Leontief production function. The fixes listed below don't

Root source: get_intermediate_inputs_matrix in icio_reader.py computes
total_output / total_intermediate_inputs for each sector. Provincial sectors with no
activity have total_output = 0 and total_intermediate_inputs = 0, producing 0/0 = NaN
Leontief coefficients. These NaN values flow through unit costs → prices → every downstream
computation.

Propagation chain:

get_intermediate_inputs_matrix (0/0 = NaN)
    └─ unit_costs → compute_price → price → price_in_usd
           ├─ Issue 12: compute_price receives estimated_ppi_inflation = NaN at t=0
           │       └─ ALL firm prices → NaN → GDP = NaN for entire simulation
           ├─ Issue 2:  compute_offered_price  AssertionError
           ├─ Issue 5:  prepare_selling_goods → set_prices → NaN nominal supply
           │       └─ Issue 6:  perform_clearing → emp_goods_prices still NaN
           │               └─ Issue 4:  fill_buckets  ValueError
           ├─ Issue 3:  production_price_index / aggregate_nominal_production → NaN RoW imports
           │       └─ AssertionError: desired_imports_in_usd >= 0
           └─ Issues 7-11:  household/government sklearn models receive NaN features
                   └─ ValueError: Input X contains NaN

Note: The root fix in icio_reader.py requires regenerating the pickle file. Additionally,
estimated_ppi_inflation is initialised to [np.nan] in economy_ts.py, so all runtime
guards below are also needed for the very first timestep regardless of pickle state.


Issue 1 — Root Source: get_intermediate_inputs_matrix

File: macro_data/readers/io_tables/icio_reader.py

Error: Silent NaN injection into every downstream price/cost computation.

Problem: Sparse provincial sectors (zero output, zero inputs) produce 0/0 = NaN:

result = total_output[None, :] / total_monthly_intermediate_inputs
return result  # NaN for sparse sectors

get_capital_inputs_matrix in the same file already has .fillna(np.inf) — this fix
makes get_intermediate_inputs_matrix consistent with it.

Fix:

result = total_output[None, :] / total_monthly_intermediate_inputs
return result.fillna(np.inf)

inf is semantically correct: infinite efficiency means the sector needs 0 intermediate inputs,
so production / inf = 0.
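A minimal sketch of the 0/0 case and the fillna fix, using made-up toy data. The two-sector layout, the names inputs and required_inputs, and the use of a pandas DataFrame for the inputs matrix are assumptions for illustration, not the actual reader code:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: sector "B" is a sparse provincial sector with no activity.
total_output = np.array([100.0, 0.0])
inputs = pd.DataFrame(
    {"A": [20.0, 5.0], "B": [0.0, 0.0]},
    index=["in_A", "in_B"],
)

# Same expression shape as in the issue: output divided by each input.
raw = total_output[None, :] / inputs   # column "B" is 0/0 = NaN
fixed = raw.fillna(np.inf)             # column "B" becomes inf

# With inf coefficients, the inputs required for any production level are 0.
required_inputs = 50.0 / fixed
print(raw)
print(required_inputs)
```

Column "A" is untouched by fillna, so active sectors keep their finite coefficients.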


Issue 2 — AssertionError: assert np.all(avg_price > 0.0) in compute_offered_price

File: macromodel/agents/firms/firms.py

Traceback site: assert np.all(avg_price > 0.0)

Problem: NaN price_in_usd with tiny-but-nonzero production passes through the safe-divide
guard (where=real != 0.0), because NaN != 0.0 = True — the division executes and returns NaN.
Then NaN == 0.0 = False, so the zero-fallback never triggers.

Fix:

avg_price = np.divide(nom, real, out=np.zeros(nom.shape), where=real != 0.0)
avg_price = np.nan_to_num(avg_price, nan=0.0, posinf=0.0, neginf=0.0)
avg_price[avg_price == 0.0] = self.ts.current("price_offered")[avg_price == 0.0]
avg_price[avg_price == 0.0] = 1.0  # final fallback for sparse sectors with no history
assert np.all(avg_price > 0.0)
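The guard failure can be reproduced with plain NumPy arrays. The values below, including the price_offered array standing in for self.ts.current("price_offered"), are hypothetical, and the sketch assumes the stored offered prices are themselves finite:

```python
import numpy as np

nom = np.array([10.0, np.nan, 0.0])   # nominal amounts; NaN comes from a NaN price
real = np.array([5.0, 1e-9, 0.0])     # real amounts; tiny but non-zero for firm 1
price_offered = np.array([2.5, 0.0, 3.0])  # hypothetical previous offered prices

# Original guard: only protects against real == 0, so NaN / 1e-9 is evaluated.
naive = np.divide(nom, real, out=np.zeros(nom.shape), where=real != 0.0)
# naive[1] is NaN, and NaN == 0.0 is False, so a zero-fallback would skip it.

# Guarded version following the fix above.
fixed = np.nan_to_num(naive, nan=0.0, posinf=0.0, neginf=0.0)
missing = fixed == 0.0
fixed[missing] = price_offered[missing]   # fall back to the last offered price
fixed[fixed == 0.0] = 1.0                 # final fallback when there is no history
assert np.all(fixed > 0.0)
```

Firm 1 ends up at 1.0 (no usable history) and firm 2 at its previous offered price.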

Issue 3 — AssertionError: assert np.all(desired_imports_in_usd >= 0.0) in RoW

File: macromodel/simulation.py

Traceback site: rest_of_the_world.py:292

Problem: production_price_index uses plain np.sum — a single NaN country makes the
index NaN. aggregate_nominal_production has the same issue. Then
np.maximum(0.0, NaN) = NaN in InflationRoWImportsSetter.compute_imports, so desired
imports become NaN and fail the >= 0 assertion.

Fix — aggregate_nominal_production:

return np.nansum([
    np.nansum(
        firms.ts.current("price") / firms.ts.initial("price")
        * (firms.ts.current("production") + firms.ts.prev("inventory"))
    )
    for firms in ...
])

Fix — production_price_index:

@property
def production_price_index(self) -> float:
    current = np.nansum([firms.ts.current("production").sum() for ...])
    initial = np.nansum([firms.ts.initial("production").sum() for ...])
    return float(current / initial) if initial > 0 else 1.0
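The two NumPy behaviours involved here can be checked in isolation. The per-country numbers and the helper safe_index are illustrative assumptions, not the simulation code:

```python
import numpy as np

# Hypothetical per-country aggregates; the middle country is NaN.
current_by_country = np.array([150.0, np.nan, 90.0])
initial_by_country = np.array([100.0, np.nan, 100.0])

plain_total = np.sum(current_by_country)   # NaN: one bad country poisons the sum
clipped = np.maximum(0.0, plain_total)     # still NaN: np.maximum propagates NaN
passes_assert = bool(np.all(clipped >= 0.0))  # False, so the RoW assertion fails

def safe_index(current, initial):
    """NaN-skipping ratio of sums with a neutral fallback of 1.0."""
    cur = np.nansum(current)
    ini = np.nansum(initial)
    return float(cur / ini) if ini > 0 else 1.0

index = safe_index(current_by_country, initial_by_country)
empty_index = safe_index(np.array([np.nan]), np.array([np.nan]))
```

np.nansum of an all-NaN array is 0.0, which is why the initial > 0 check is needed before dividing.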

Issue 4 — ValueError: Nan in transactor_total_real_supply in fill_buckets

File: macromodel/markets/goods_market/func/lib_water_bucket.py

Traceback site: line ~773

Two sub-problems:

(a) NaN fill_amount

fill_amount = total_real_demand[g] / aggr_real_demand[g] is 0/0 = NaN for sparse
industries with no demand. fill_amount == 0.0 is False for NaN, so the early-exit guard
never fires.

Fix: Add or not np.isfinite(fill_amount) to the guard:

if np.sum(capacities) == 0 or fill_amount == 0.0 or not np.isfinite(fill_amount):
    return np.zeros_like(capacities)

(b) Broken NaN-capacity check

The original guard is:

if np.sum(capacities) == np.sum(capacities) + 1:   # always False when sum is NaN

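Any comparison involving NaN is False under IEEE 754, so the guard never fires when the sum
is NaN (for ordinary finite sums x == x + 1 is False as well, so the check can only catch
infinite or extremely large totals). NaN capacities (from budget / NaN_price) pass into the
allocation logic. When minimum_fill > 0, they propagate via
capacities / sum(capacities) * minimum_fill * fill_amount = c / NaN = NaN.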

Fix: Replace the broken guard with nan_to_num at function entry:

capacities = np.nan_to_num(capacities, nan=0.0)  # NaN buyer capacity → 0
if np.sum(capacities) == 0 or fill_amount == 0.0 or not np.isfinite(fill_amount):
    return np.zeros_like(capacities)
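Both sub-problems can be shown on a two-buyer example. The helper clean_or_skip is a hypothetical stand-in for the entry of fill_buckets and only mirrors the guard shown above:

```python
import numpy as np

with np.errstate(invalid="ignore"):
    nan_fill = np.float64(0.0) / np.float64(0.0)   # 0/0 demand ratio = NaN

caps = np.array([3.0, np.nan])   # second buyer: budget / NaN price
total = np.sum(caps)

old_zero_guard = bool(nan_fill == 0.0)      # False: NaN is not equal to 0
old_nan_guard = bool(total == total + 1)    # False: comparisons with NaN fail

def clean_or_skip(capacities, fill_amount):
    """Return (capacities, skip). NaN capacities are zeroed first."""
    capacities = np.nan_to_num(capacities, nan=0.0)
    if np.sum(capacities) == 0 or fill_amount == 0.0 or not np.isfinite(fill_amount):
        return np.zeros_like(capacities), True
    return capacities, False

skipped, skip_flag = clean_or_skip(caps, nan_fill)
kept, keep_flag = clean_or_skip(caps, 0.5)
```

With a finite fill_amount the NaN buyer is treated as having zero capacity while the other buyer is still served.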

Issue 5 — ValueError: Nan in average_goods_price or real_amount_bought (first occurrence)

File: macromodel/agents/firms/firms.py

Traceback site: lib_water_bucket.py:812

Problem: price_in_usd = 1/exchange_rate * price. When price is NaN (because
estimated_ppi_inflation = [np.nan] at init), price_in_usd = NaN. This is passed directly
to set_prices, storing NaN in seller states ["Prices"]. collect_seller_info then
computes quantity * NaN = NaN nominal supply → NaN average price.

Fix (in prepare_selling_goods):

self.set_prices(
    np.nan_to_num(
        self.ts.current("price_in_usd"),
        nan=1.0 / self.exchange_rate_usd_to_lcu,
    )
)
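As a quick check of what this fallback means numerically: replacing NaN with 1 / exchange_rate is the same as assuming a local-currency price of 1.0. The exchange rate and prices below are made-up values:

```python
import numpy as np

exchange_rate_usd_to_lcu = 1.25        # hypothetical: 1 USD = 1.25 local units
price = np.array([2.0, np.nan])        # second firm has a NaN local price

price_in_usd = 1.0 / exchange_rate_usd_to_lcu * price   # second entry is NaN
guarded = np.nan_to_num(price_in_usd, nan=1.0 / exchange_rate_usd_to_lcu)

# Converting back shows the implied local price of the fallback entry.
implied_local_price = guarded * exchange_rate_usd_to_lcu
```

Finite prices pass through nan_to_num unchanged, so only the NaN sellers are affected.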

Issue 6 — ValueError: Nan in average_goods_price or real_amount_bought (second occurrence)

File: macromodel/markets/goods_market/func/clearing.py

Traceback site: lib_water_bucket.py:812

Problem: perform_clearing fills missing emp_goods_prices via the country/global average
chain. If average_prices_by_country[-1] (the global fallback) is also NaN for a sparse
sector — because all sellers in that industry have stale NaN prices — emp_goods_prices
remains NaN after all fallbacks.

Fix: Add a final guard after the existing fallback chain:

# If all fallbacks failed use 1.0: aggr_real_supply[g] = 0 → fill_amount = 0 → no trade.
emp_goods_prices[np.isnan(emp_goods_prices) | (emp_goods_prices == 0.0)] = 1.0

Issue 7 — ValueError: Input X contains NaN in DefaultWealthSetter.distribute_new_wealth

File: macromodel/agents/households/func/wealth.py

Traceback site: sklearn.linear_model._base.LinearModel.predict (called from wealth.py:309)

Problem: distribute_new_wealth stacks ts.current("income") and ts.current("debt")
into feature matrix x. NaN != 0.0 = True causes the non_zero column mask to select NaN
columns, so x /= NaN_sum produces NaN features. sklearn's LinearRegression.predict rejects
NaN input.

Fix:

x = np.nan_to_num(x, nan=0.0)   # sparse households have NaN income/debt → treat as 0
non_zero = x.sum(axis=0) != 0.0
x[:, non_zero] /= x.sum(axis=0)[non_zero]
pred_deposit_fraction = model.predict(x)
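The normalisation step can be exercised without sklearn. In this sketch the feature values and the helper normalise_columns are illustrative assumptions; it shows that a single NaN household poisons its whole column unless NaNs are zeroed first:

```python
import numpy as np

# Hypothetical features: rows are households, columns are (income, debt).
x = np.array([[100.0, 50.0],
              [np.nan, 50.0]])

# Unguarded version: the NaN column sum passes the != 0.0 mask.
buggy = x.copy()
mask = buggy.sum(axis=0) != 0.0
buggy[:, mask] /= buggy.sum(axis=0)[mask]   # column 0 becomes all NaN

def normalise_columns(features):
    """Zero out NaNs, then divide each non-empty column by its sum."""
    out = np.nan_to_num(features, nan=0.0)   # returns a copy by default
    col_sums = out.sum(axis=0)
    non_zero = col_sums != 0.0
    out[:, non_zero] /= col_sums[non_zero]
    return out

clean = normalise_columns(x)
```

The cleaned matrix contains only finite values, which is what LinearRegression.predict requires.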

Issue 8 — ValueError: Input X contains NaN in DefaultSocialTransfersSetter.get_social_transfers

File: macromodel/agents/households/func/social_transfers.py

Traceback site: sklearn LinearRegression.predict

Problem: Same structural bug as Issue 7 — current_independents /= current_independents.sum(axis=0)
does not guard against NaN columns (no non_zero check at all), and the subsequent
pred_transfers /= np.sum(pred_transfers) would produce NaN if all predictions are zero.

Fix:

current_independents = np.nan_to_num(current_independents, nan=0.0)
col_sums = current_independents.sum(axis=0)
non_zero = col_sums != 0.0
current_independents[:, non_zero] /= col_sums[non_zero]
pred_transfers = model.predict(current_independents)
pred_transfers[pred_transfers < 0] = 0.0
total_pred = np.sum(pred_transfers)
if total_pred > 0.0:
    pred_transfers /= total_pred
else:
    pred_transfers = np.full_like(pred_transfers, 1.0 / n_households)

The same fix was applied to ConstantSocialTransfersSetter which uses initial_independents
with the same unguarded normalization pattern.
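The zero-sum fallback on its own, as a small sketch. The helper to_shares is hypothetical, and it uses the array length as a stand-in for n_households:

```python
import numpy as np

def to_shares(pred):
    """Clip negative predictions and normalise to shares that sum to 1."""
    pred = np.array(pred, dtype=float)   # copy so the caller's array is untouched
    pred[pred < 0] = 0.0
    total = np.sum(pred)
    if total > 0.0:
        return pred / total
    # All predictions were zero or negative: fall back to equal shares.
    return np.full_like(pred, 1.0 / pred.size)

regular = to_shares([1.0, 3.0, -2.0])
degenerate = to_shares([-1.0, 0.0, -3.0, 0.0])
```

Without the else branch the degenerate case would divide by zero and hand NaN transfers to every household.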


Issue 9 — ValueError: Input X contains NaN in DefaultSavingRatesSetter.get_saving_rates

File: macromodel/agents/households/func/saving_rates.py

Traceback site: sklearn LinearRegression.predict

Problem: Same structural bug — current_independents /= current_independents.sum(axis=0)
with no NaN or zero-sum guard.

Fix:

current_independents = np.nan_to_num(current_independents, nan=0.0)
col_sums = current_independents.sum(axis=0)
non_zero = col_sums != 0.0
current_independents[:, non_zero] /= col_sums[non_zero]
pred_sr = model.predict(current_independents)

The same fix was applied to ConstantSavingRatesSetter.


Issue 10 — ValueError: Input X contains NaN in DefaultSocialBenefitsSetter

File: macromodel/agents/central_government/func/social_benefits.py

Traceback site: sklearn LinearRegression.predict

Problem: compute_unemployment_benefits and compute_regular_transfer_to_households
both call model.predict(np.array([[historic_ppi_inflation[-1], current_unemployment_rate]])).
historic_ppi_inflation[-1] is NaN at the first timestep (initialised as [np.nan] in
economy_ts.py).

Fix:

ppi = float(np.nan_to_num(historic_ppi_inflation[-1], nan=0.0))
pred = model.predict(np.array([[ppi, current_unemployment_rate]]))[0]

Issue 11 — ValueError: Input X contains NaN in ExchangeRates model mode

File: macromodel/exchange_rates/exchange_rates.py

Traceback site: sklearn LinearRegression.predict

Problem: When exchange_rate_type = "model", the call is:

model.predict(np.array([prev_inflation, prev_growth]))

Two problems: (1) prev_inflation or prev_growth may be NaN if the production price index
or GDP growth is NaN; (2) the array shape is (2,) — sklearn expects (n_samples, n_features)
= (1, 2).

Fix:

features = np.nan_to_num(np.array([[prev_inflation, prev_growth]]), nan=0.0)
return self.exchange_rates_model[data_country].predict(features)
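The shape difference is easy to miss, so here it is in isolation. The helper make_features is a hypothetical name; sklearn itself is not imported, the sketch only builds the input array:

```python
import numpy as np

prev_inflation, prev_growth = 0.02, np.nan   # made-up values, growth is NaN

flat = np.array([prev_inflation, prev_growth])   # shape (2,): rejected as 1-D input

def make_features(inflation, growth):
    """Build a single-sample feature row of shape (1, 2) with NaN mapped to 0."""
    features = np.array([[inflation, growth]], dtype=float)
    return np.nan_to_num(features, nan=0.0)

features = make_features(prev_inflation, prev_growth)
```

The extra pair of brackets is what turns the two values into one sample with two features.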

Issue 12 — All GDP outputs are NaN: estimated_ppi_inflation = NaN in compute_price

File: macromodel/agents/firms/func/prices.py

Symptom: Every GDP time-series value is NaN from the very first timestep, even when the
model runs without assertion errors.

Problem: DefaultPriceSetter.compute_price multiplies the base price by
(1 + gf * current_estimated_ppi_inflation). At t=0, estimated_ppi_inflation is
initialised to [np.nan] in economy_ts.py, so this factor is NaN for every firm in
every province. The floor np.maximum(1e-2, NaN) = NaN also fails to apply, leaving all
prices NaN. All GDP components — total_output = (price * production).sum(),
total_sales, gross_operating_surplus, wages — inherit this NaN and propagate it into
compute_gdp, producing NaN for the entire simulation.

A secondary path: average_price_by_firm (from the previous period's clearing) can also
be NaN for sparse sectors. Because NaN != 0.0 = True, the safe-divide guard in
cost_push_inflation executes the division (curr_unit_costs / NaN = NaN), and
np.maximum(min_inflation, NaN) = NaN.

Fix:

# Guard against NaN average prices leaking into cost_push_inflation
cost_push_inflation = (
    np.divide(
        curr_unit_costs,
        average_price_by_firm,
        out=np.ones_like(curr_unit_costs),
        where=np.isfinite(average_price_by_firm) & (average_price_by_firm != 0.0),
    )
    - 1.0
)
cost_push_inflation = np.nan_to_num(cost_push_inflation, nan=0.0)
cost_push_inflation = np.maximum(min_inflation, np.minimum(max_inflation, cost_push_inflation))

# Treat NaN estimated_ppi_inflation as 0 (no global inflation adjustment for this step)
safe_ppi = float(np.nan_to_num(current_estimated_ppi_inflation, nan=0.0))

return np.maximum(
    1e-2,
    prev_prices
    * (1 + np.random.normal(...))
    * (1 + self.price_setting_speed_gf * safe_ppi)   # was: current_estimated_ppi_inflation
    * (1 + self.price_setting_speed_dp * demand_pull_inflation)
    * (1 + self.price_setting_speed_cp * cost_push_inflation),
)

SectorExogenousPriceSetter calls super().compute_price(...) and inherits the fix
automatically.
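Both NaN paths can be reproduced with two firms. The speed parameter gf, the prices, and the unit costs below are made-up values, and the noise and demand-pull terms are left out to keep the sketch deterministic:

```python
import numpy as np

prev_prices = np.array([1.0, 2.0])
gf = 0.5                       # hypothetical price_setting_speed_gf
ppi = np.nan                   # estimated_ppi_inflation at t=0

# The price floor does not help: np.maximum propagates NaN.
naive = np.maximum(1e-2, prev_prices * (1 + gf * ppi))

safe_ppi = float(np.nan_to_num(ppi, nan=0.0))
safe = np.maximum(1e-2, prev_prices * (1 + gf * safe_ppi))

# Cost-push term: masked entries keep the out value 1.0, so 1.0 - 1.0 = 0 inflation.
curr_unit_costs = np.array([1.1, 2.0])
average_price_by_firm = np.array([1.0, np.nan])
ok = np.isfinite(average_price_by_firm) & (average_price_by_firm != 0.0)
cost_push = np.divide(
    curr_unit_costs,
    average_price_by_firm,
    out=np.ones_like(curr_unit_costs),
    where=ok,
) - 1.0
```

Initialising out with ones is what makes the skipped firms default to zero cost-push inflation rather than to -1.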


Issue 13 — GDP is NaN: compute_average_price and compute_gdp are not NaN-safe

Files: macromodel/economy/economy.py, macromodel/country/country.py

Symptom: ts_gdp is NaN for all provinces across the entire simulation.

Problem — compute_average_price (economy.py):
The zero-trade fallback uses == 0.0 to detect that no transactions occurred:

if (real_sum == 0.0) or (nominal_sum == 0.0):
    return self.ts.current("good_prices")[industry]   # safe fallback

NaN == 0.0 = False, so any residual NaN in real_amount_bought or nominal_amount_spent
(from sparse sectors) bypasses the check and computes NaN / NaN = NaN. This NaN is stored
in good_prices, which then flows into compute_used_intermediate_inputs_costs via
matmul(used_inputs, good_prices) → NaN in sectoral_intermediate_consumption → NaN GDP.

Fix — compute_average_price: Switch to np.nansum so residual NaN transaction records
from sparse sectors are treated as zero, then guard with == 0.0:

total_real    = np.nansum(firm_real) + np.nansum(hh_real) + np.nansum(gov_real)
total_nominal = np.nansum(firm_nom)  + np.nansum(hh_nom)  + np.nansum(gov_nom)
if total_real == 0.0 or total_nominal == 0.0:
    return self.ts.current("good_prices")[industry]   # no trade → keep previous price
return total_nominal / total_real
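A standalone version of the same logic, for checking the fallback behaviour. The function average_price and its arguments are hypothetical; the real method reads its inputs from the time series:

```python
import numpy as np

def average_price(real_parts, nominal_parts, previous_price):
    """NaN-skipping average price with a no-trade fallback."""
    total_real = sum(np.nansum(r) for r in real_parts)
    total_nominal = sum(np.nansum(n) for n in nominal_parts)
    if total_real == 0.0 or total_nominal == 0.0:
        return previous_price          # no trade: keep the previous price
    return total_nominal / total_real

nan_only = [np.array([np.nan]), np.array([np.nan, np.nan])]
no_trade = average_price(nan_only, nan_only, previous_price=1.7)

real = [np.array([2.0, np.nan]), np.array([3.0])]
nominal = [np.array([4.0, np.nan]), np.array([11.0])]
traded = average_price(real, nominal, previous_price=1.7)
```

Note that nansum drops NaNs on the real and nominal sides independently, so a record that is NaN on only one side would still contribute on the other.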

Problem — compute_gdp call site (country.py):
All GDP components use plain .sum() on firm time series
(total_sales, used_intermediate_inputs_costs, gross_operating_surplus_mixed_income,
total_wage, total_inventory_change, investment, etc.). A single NaN firm contribution
(sparse sector) makes the aggregate NaN. np.bincount with NaN weights also propagates NaN.

Fix — GDP call site: Replace every .sum() with np.nansum() and every np.bincount(..., weights=x) with np.bincount(..., weights=np.nan_to_num(x, nan=0.0)):

total_output=float(np.nansum(price * production)),
sectoral_sales=np.bincount(Industry, weights=np.nan_to_num(total_sales, nan=0.0), ...),
sectoral_intermediate_consumption=np.bincount(Industry, weights=np.nan_to_num(used_ii_costs, nan=0.0), ...),
change_in_inventories=float(np.nansum(inventory_change) + np.nansum(ii_bought) - np.nansum(ii_used)),
gross_fixed_capital_formation=float(np.nansum(capital_bought) + (1+tau_cf)*float(np.nansum(investment))),
operating_surplus=float(np.nansum(gross_operating_surplus)),
wages=float(np.nansum(total_wage)),
...
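The bincount behaviour is worth a quick check, since it is less obvious than .sum(). The industry labels and sales values here are made up:

```python
import numpy as np

industry = np.array([0, 0, 1])            # hypothetical firm-to-industry mapping
total_sales = np.array([10.0, np.nan, 5.0])

# NaN weights poison the whole bin they fall into.
raw = np.bincount(industry, weights=total_sales, minlength=2)

# Zeroing NaN weights keeps the contributions of the other firms.
safe = np.bincount(
    industry,
    weights=np.nan_to_num(total_sales, nan=0.0),
    minlength=2,
)
```

Only industry 0 is affected in the raw result; industry 1 has no NaN firm and sums normally.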

Reproducer

A self-contained test file is available at tests/test_provincial_sparse_sector_nans.py.
It contains BUG/FIX test pairs that reproduce the core errors (Issues 1–6) with pure NumPy
arrays — no model pickle required.

pytest tests/test_provincial_sparse_sector_nans.py -v

Affected Files

| File | Change |
| --- | --- |
| macro_data/readers/io_tables/icio_reader.py | get_intermediate_inputs_matrix: add .fillna(np.inf) |
| macromodel/agents/firms/func/prices.py | DefaultPriceSetter.compute_price: nan_to_num(estimated_ppi_inflation) + isfinite guard on cost_push_inflation |
| macromodel/agents/firms/firms.py | compute_offered_price: nan_to_num + = 1.0 fallback |
| macromodel/agents/firms/firms.py | prepare_selling_goods: nan_to_num(price_in_usd) before set_prices |
| macromodel/simulation.py | aggregate_nominal_production + production_price_index: np.nansum |
| macromodel/markets/goods_market/func/lib_water_bucket.py | fill_buckets: nan_to_num(capacities) + isfinite guard |
| macromodel/markets/goods_market/func/clearing.py | perform_clearing: final emp_goods_prices guard → 1.0 |
| macromodel/agents/households/func/wealth.py | distribute_new_wealth: nan_to_num(x) before sklearn predict |
| macromodel/agents/households/func/social_transfers.py | Default/ConstantSocialTransfersSetter: nan_to_num + non_zero guard + zero-sum fallback |
| macromodel/agents/households/func/saving_rates.py | Default/ConstantSavingRatesSetter: nan_to_num + non_zero guard |
| macromodel/agents/central_government/func/social_benefits.py | compute_unemployment_benefits + compute_regular_transfer_to_households: nan_to_num(ppi) |
| macromodel/exchange_rates/exchange_rates.py | get_exchange_rate model branch: nan_to_num + reshape to (1, 2) |
| macromodel/economy/economy.py | compute_average_price: np.nansum so NaN transaction records don't bypass the zero-trade fallback |
| macromodel/country/country.py | compute_gdp call site: np.nansum + nan_to_num on every .sum() and bincount weight |

icio_2014_can_provinces.csv

test_provincial_sparse_sector_nans.py

macromodel-carbon-policy-scenarios-provincial.ipynb
