Hi OS community,
I've been troubleshooting my analyses for some time now and have been running into errors similar to those described in other issues here (e.g., "task failed on specific row and column" and "AssertionError: norm(G * v .- curr) / norm(curr) < 1.0e-6").
Now that I've found a solution to those errors, my analyses are finally running longer than before, in some cases reaching up to 90%. However, half of them now fail with the message below and produce a core dump. I've found some mention of this on the general Julia boards, where the discussion turns to rather complicated memory allocation issues that I couldn't make sense of. I'm running these analyses on a computing cluster (4 CPUs, 32 GB RAM per CPU).
signal (11): Segmentation fault
in expression starting at none:0
Here's the full stack trace:
Stacktrace:
[1] wait
@ ./task.jl:345 [inlined]
[2] threading_run(fun::Omniscape.var"#161#threadsfor_fun#12"{Omniscape.var"#161#threadsfor_fun#11#13"{Int64, ProgressMeter.Progress, Int64, Dict{String, String}, Omniscape.ConditionLayers{Float64, 2}, Omniscape.Conditions, Omniscape.OmniscapeFlags, DataType, Dict{String, Int64}, UnitRange{Int64}}}, static::Bool)
@ Base.Threads ./threadingconstructs.jl:38
[3] macro expansion
@ ./threadingconstructs.jl:89 [inlined]
[4] run_omniscape(cfg::Dict{String, String}, resistance::Matrix{Union{Missing, Float64}}; reclass_table::Matrix{Union{Missing, Float64}}, source_strength::Matrix{Union{Missing, Float64}}, condition1::Matrix{Union{Missing, Float64}}, condition2::Matrix{Union{Missing, Float64}}, condition1_future::Matrix{Union{Missing, Float64}}, condition2_future::Matrix{Union{Missing, Float64}}, wkt::String, geotransform::Vector{Float64}, write_outputs::Bool)
@ Omniscape ~/.julia/packages/Omniscape/9gHf2/src/main.jl:257
[5] run_omniscape(path::String)
@ Omniscape ~/.julia/packages/Omniscape/9gHf2/src/main.jl:536
[6] top-level scope
@ /blue/scheffers/jbaecher/global_connectivity/julia_scripts/hpg_Asia_Europe.jl:7
nested task error:
Progress: 9%|████ | ETA: 14:22:01
signal (11): Segmentation fault
in expression starting at none:0
__GI_memset at /lib64/libc.so.6 (unknown line)
cholmod_l_super_numeric at /apps/julia/1.8.2/lib/julia/libcholmod.so (unknown line)
cholmod_l_factorize_p at /apps/julia/1.8.2/lib/julia/libcholmod.so (unknown line)
cholmod_l_factorize_p at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/SuiteSparse/lib/x86_64-linux-gnu.jl:1116
unknown function (ip: 0x2abb91df2a00)
_jl_invoke at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2367 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2549
factorize_p! at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/SuiteSparse/src/cholmod.jl:616
#cholesky!#6 at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/SuiteSparse/src/cholmod.jl:1147
cholesky!##kw at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/SuiteSparse/src/cholmod.jl:1143 [inlined]
#cholesky#8 at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/SuiteSparse/src/cholmod.jl:1185 [inlined]
cholesky at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/SuiteSparse/src/cholmod.jl:1178 [inlined]
#cholesky#9 at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/SuiteSparse/src/cholmod.jl:1297 [inlined]
cholesky at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/SuiteSparse/src/cholmod.jl:1297 [inlined]
macro expansion at ./timing.jl:382 [inlined]
construct_cholesky_factor at /home/jbaecher/.julia/packages/Circuitscape/33lUW/src/core.jl:496
multiple_solve at /home/jbaecher/.julia/packages/Circuitscape/33lUW/src/raster/advanced.jl:319
multiple_solver at /home/jbaecher/.julia/packages/Circuitscape/33lUW/src/raster/advanced.jl:291
_jl_invoke at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2367 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2549
compute_omniscape_current at /home/jbaecher/.julia/packages/Circuitscape/33lUW/src/utils.jl:529
_jl_invoke at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2367 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2549
solve_target! at /home/jbaecher/.julia/packages/Omniscape/9gHf2/src/utils.jl:332
unknown function (ip: 0x2abb91dfe3b0)
_jl_invoke at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2367 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2549
macro expansion at /home/jbaecher/.julia/packages/Omniscape/9gHf2/src/main.jl:264 [inlined]
#161#threadsfor_fun#11 at ./threadingconstructs.jl:84
#161#threadsfor_fun at ./threadingconstructs.jl:51 [inlined]
#1 at ./threadingconstructs.jl:30
unknown function (ip: 0x2abb91df9b2f)
_jl_invoke at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2367 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2549
jl_apply at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/julia.h:1839 [inlined]
start_task at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/task.c:931
Allocations: 194969686 (Pool: 192768960; Big: 2200726); GC: 702
/tmp/slurmd/job61564683/slurm_script: line 24: 13146 Segmentation fault (core dumped) julia -p ${SLURM_CPUS_ON_NODE} julia_scripts/hpg_Asia_Europe.jl
Tue Apr 11 17:36:24 EDT 2023