eCLM-PDAF crashes after some time for multivariate assimilation with PDAF-OMI #41

@s7yoewer

The model crashes after some time in every simulation (after roughly 4.5 years). When I restart the model after 4 years with the given restart files, it works fine again until around 4.5 years. Sometimes it runs a little longer and sometimes a little shorter, but this is roughly the duration. The .err file only gives:
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
libc.so.6 000014ED66A45BF0 Unknown Unknown Unknown
libpscom.so.2.0.0 000014ED65D2C5A7 pscom_progress Unknown Unknown
libpscom.so.2.0.0 000014ED65D2CA03 pscom_wait_any Unknown Unknown
libmpi.so.12.3.1 000014ED68799C03 Unknown Unknown Unknown
libmpi.so.12.3.1 000014ED68754A61 Unknown Unknown Unknown
libmpi.so.12.3.1 000014ED68725A44 Unknown Unknown Unknown
libmpi.so.12.3.1 000014ED68725E93 Unknown Unknown Unknown
libmpi.so.12.3.1 000014ED68699ED5 Unknown Unknown Unknown
libmpi.so.12.3.1 000014ED6870F4CD Unknown Unknown Unknown
libmpi.so.12.3.1 000014ED6870F654 Unknown Unknown Unknown
libmpi.so.12.3.1 000014ED6870F874 Unknown Unknown Unknown
libmpi.so.12.3.1 000014ED686992CE Unknown Unknown Unknown
libmpi.so.12.3.1 000014ED6870E584 Unknown Unknown Unknown
libmpi.so.12.3.1 000014ED6870E645 Unknown Unknown Unknown
libmpi.so.12.3.1 000014ED6870E7A4 Unknown Unknown Unknown
libmpi.so.12.3.1 000014ED685E9423 PMPI_Barrier Unknown Unknown
libmpifort.so.12. 000014ED6715898C pmpi_barrier Unknown Unknown
tsmp-pdaf 0000000000423970 Unknown Unknown Unknown
tsmp-pdaf 0000000000419D6D Unknown Unknown Unknown
libc.so.6 000014ED66A305D0 Unknown Unknown Unknown
libc.so.6 000014ED66A30680 __libc_start_main Unknown Unknown
tsmp-pdaf 0000000000419C85 Unknown Unknown Unknown

So I am not sure why this happens. @jjokella, do you have an idea, or have you already seen something similar? Given the restart error in eCLM, I can only use the first years of my simulation to compare with an OL run. This does not happen when no observations are assimilated (an OL run), so I suspect a memory issue (?).
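
To test the memory hypothesis, the resident memory of one `tsmp-pdaf` rank could be logged over the run and checked for steady growth. Below is a minimal sketch, assuming a Linux node where `/proc/<pid>/status` is readable. The helper names (`parse_vmrss`, `rss_kib`, `growth_per_sample`) are hypothetical and not part of eCLM or PDAF.

```python
import re


def parse_vmrss(status_text):
    """Extract VmRSS (resident memory, in kB) from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def rss_kib(pid="self"):
    """Return the resident memory of a process in kB, or None if unreadable.

    Assumes a Linux /proc filesystem; pid is the (hypothetical) PID of one
    tsmp-pdaf rank on the local node.
    """
    try:
        with open(f"/proc/{pid}/status") as handle:
            return parse_vmrss(handle.read())
    except OSError:
        return None


def growth_per_sample(samples):
    """Least-squares slope of the samples against their index.

    A clearly positive slope over many assimilation cycles would be
    consistent with (but not proof of) a memory leak.
    """
    n = len(samples)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

Sampling `rss_kib(pid)` at a fixed interval for both the assimilation run and the OL run, then comparing `growth_per_sample` for the two, would show whether memory only grows when observations are assimilated.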
