Clean up unrealistic synthetic household initial conditions (wealth/income outliers) #93

@agurgone

Context

Surfaced while resolving #90 (CreditAugmentedConsumption unbounded wealth-ratio blowup). That issue is being fixed at the formula-input level: clip NLA/y, IFA/y, HA/y, and LFA/y to calibration-derived bounds every period before they feed _evaluate_target and the feasibility ceiling.
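
For reference, a minimal sketch of what such a per-period clip could look like. This is not the code from #90: the function name `clip_wealth_ratios`, the dict-of-arrays layout, and the numbers in `RATIO_BOUNDS` are all hypothetical placeholders, and income is assumed to be strictly positive.

```python
import numpy as np

# Hypothetical bounds (lower, upper) per ratio; the real values would be
# derived from calibration, not hard-coded like this.
RATIO_BOUNDS = {
    "NLA": (-5.0, 20.0),
    "IFA": (0.0, 10.0),
    "HA": (0.0, 30.0),
    "LFA": (0.0, 10.0),
}


def clip_wealth_ratios(balances, income, bounds=RATIO_BOUNDS):
    """Return each wealth/income ratio clipped to its (lower, upper) bound.

    balances: dict mapping ratio name -> 1-D array of balance-sheet levels.
    income:   1-D array of household income, assumed strictly positive.
    """
    clipped = {}
    for name, (lower, upper) in bounds.items():
        ratio = np.asarray(balances[name], dtype=float) / income
        clipped[name] = np.clip(ratio, lower, upper)
    return clipped
```

The clipped ratios would then be what feeds `_evaluate_target` and the feasibility ceiling; the underlying balance sheets are left untouched.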

That per-period clip is a structural safety net, not a data-quality fix. It doesn't address why some synthetic households start with wealth/income ratios far outside any plausible range in the first place (e.g. seed 15, FRA, t=1: NLA/y ranging from -123 to +91, IFA/y up to 52). These look like artifacts of the HFCS-to-synthetic-population mapping (combined with population_scale=5000 inflating individual draws), not genuine heterogeneity.

Proposed direction

At initialization, detect households whose wealth/income covariates fall outside a plausible range. The calibration-derived bounds used for the runtime clip in #90 are a natural starting candidate; a separate microdata-anchored threshold is an alternative. Replace each flagged household's balance-sheet draw by resampling from the rest of the synthetic population, rather than keeping the implausible value or relying solely on the runtime clip to mask it.
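
A rough sketch of the detect-and-resample step, to make the proposal concrete. Everything here is an assumption for illustration: the function name `resample_implausible`, the dict-of-arrays layout, and the choice to copy the donor's whole balance sheet jointly (so cross-ratio correlations are preserved) rather than resampling each ratio independently.

```python
import numpy as np


def resample_implausible(ratios, bounds, rng):
    """Replace out-of-range households with draws from in-range ones.

    ratios: dict mapping ratio name -> 1-D array, one entry per household.
    bounds: dict mapping ratio name -> (lower, upper); hypothetical values.
    rng:    numpy.random.Generator used to pick donor households.
    Returns (new_ratios, flagged), where flagged is a boolean mask.
    """
    n_households = len(next(iter(ratios.values())))
    flagged = np.zeros(n_households, dtype=bool)
    # A household is flagged if ANY bounded ratio is outside its range.
    for name, (lower, upper) in bounds.items():
        values = np.asarray(ratios[name])
        flagged |= (values < lower) | (values > upper)

    donors = np.flatnonzero(~flagged)
    if donors.size == 0:
        raise ValueError("no plausible households left to resample from")

    # One donor per flagged household, drawn with replacement.
    picks = rng.choice(donors, size=int(flagged.sum()), replace=True)

    new_ratios = {}
    for name, values in ratios.items():
        updated = np.array(values, dtype=float)  # copy; input is not mutated
        updated[flagged] = updated[picks]
        new_ratios[name] = updated
    return new_ratios, flagged
```

Returning the `flagged` mask alongside the cleaned ratios would make it easy to log how many households were replaced per seed and country, which seems useful for judging whether the bounds are too tight.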

Open questions

Related
