extract data from mat file, reproject the facies and plot the virtual cores#4

Open
xyl96 wants to merge 17 commits into main from issue_3_strata_comparison

Conversation

xyl96 commented Dec 16, 2025

Contributor

Extract the data from the STACKER MAT file, reproject the facies, and plot the virtual cores.

xyl96 commented Dec 16, 2025

Contributor Author

Here are some discrepancies.
The figure I obtained is shown below:
STACKER_cores
The three cores (the middle three) taken directly from STACKER are shown below:
cross_sinSL
I do not know why this is happening.

xyl96 commented Dec 17, 2025

Contributor Author

I did something wrong in the code! I will check it today!

xyl96 commented Dec 17, 2025

Contributor Author

The plotting issue has been solved.

@EmiliaJarochowska

Member

As general feedback on this PR: it is not clear what the problem is. From the description of issue #3 it seems to me that plotting the columns is the problem, but from the PR title it seems that extracting data from MAT files is the problem. Which one is it?

If it is the plotting that is needed, is there anything about these cores that requires writing completely new code for plotting columns? There are a bunch of packages for doing that, including our home one https://mindthegap-erc.github.io/stratcols/ by @NiklasHohmann

So my general feedback:

  1. Specify which feature you are developing (extracting, plotting, comparing?) and make sure the PR matches the stated problem
  2. The thicknesses of the cores don't match the figure from Stacker (e.g. there is no core with a thickness of 0.6 m)
  3. Store data in data/, not in src/
  4. Would using stratcols resolve the problem? (But see 1: what exactly is the problem?)

Comment thread src/PlottingCores/plottingcores.jl
Comment thread src/PlottingCores/plottingcores.jl Outdated
for idx in eachindex(core_layers_norm)[2:end]
    top = core_layers_norm[idx]
    bottom = core_layers_norm[idx-1]
    if core_facies[idx] != 100

Member

Why assign properties (color and transparency) to the hiatus if it is never plotted?

Contributor Author

I added it while writing the code, specifically to check that the code was doing what I intended.
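For illustration, the hiatus handling discussed in this thread could be isolated in a small helper that only returns the intervals that are actually drawn. This is a minimal sketch, not the code in plottingcores.jl: the function name `core_intervals` and the constant `HIATUS` are hypothetical, and it assumes that facies code 100 marks a hiatus, as the `!= 100` check in the snippet suggests.

```julia
# Hypothetical helper, not taken from plottingcores.jl.
# Assumption: facies code 100 marks a hiatus (inferred from the `!= 100` check).
const HIATUS = 100

# Return (bottom, top, facies) tuples for every deposited interval,
# skipping intervals flagged as hiatus so they need no plot properties.
function core_intervals(core_layers_norm::AbstractVector{<:Real},
                        core_facies::AbstractVector{<:Integer})
    length(core_layers_norm) == length(core_facies) ||
        throw(ArgumentError("layers and facies must have the same length"))
    intervals = Tuple{Float64,Float64,Int}[]
    for idx in eachindex(core_layers_norm)[2:end]
        core_facies[idx] == HIATUS && continue   # nothing to draw for a hiatus
        bottom = core_layers_norm[idx - 1]
        top = core_layers_norm[idx]
        push!(intervals, (Float64(bottom), Float64(top), Int(core_facies[idx])))
    end
    return intervals
end

# Toy input: four layer boundaries, the third interval is a hiatus.
intervals = core_intervals([0.0, 0.2, 0.5, 1.0], [1, 2, 100, 3])
```

With this split, the plotting loop would only need colours for the facies that appear in `intervals`.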

Comment thread src/PlottingCores/plottingrealcores.jl
xyl96 changed the title from "extract data from mat file" to "extract data from mat file, reproject the facies and plot the virtual cores" Dec 18, 2025

xyl96 commented Dec 18, 2025

Contributor Author

Reply to the feedback above:

  1. The initiative and aims have been updated. The aim is threefold: extracting the data, reprojecting the facies, and plotting. The comparison is not done yet and should go in a separate issue, so this PR is not purely about plotting.
  2. That is what I am wondering about too. I checked the original STACKER code yesterday and I think I followed the same approach as its author.
  3. No data was stored in src; it was stored in results, except for the results from STACKER. What should I do with the MAT file?

Comment thread data/Abacos.xlsx

@EmiliaJarochowska EmiliaJarochowska left a comment

Member

Not related to this PR specifically but more generally: please add a README (don't hesitate to ask for help).

I think the main problem is storing all the MAT files in the repo. Git cannot meaningfully diff binary files, and they take a lot of space. It would probably make more sense to put them on Zenodo (separate repo) and download them using datahugger? But you can ask @NiklasHohmann for a second opinion.

In rastering.jl I get:

ERROR: KeyError: key "trackID" not found

Consider not tracking files that are generated in the code, just to keep the repo lean
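Regarding the `KeyError` reported above, one way to make such a failure easier to diagnose is to check for the key before indexing and to list the keys that are present. This is only a sketch under assumptions: the helper name `get_track_id` is hypothetical, and `props` stands in for whatever dictionary rastering.jl indexes with "trackID", which is not shown in this conversation.

```julia
# Hypothetical helper; `props` stands in for the Dict that rastering.jl indexes.
function get_track_id(props::AbstractDict; key::AbstractString = "trackID")
    haskey(props, key) && return props[key]
    # Report which keys exist so a renamed or missing field is easy to spot.
    available = join(sort!([string(k) for k in keys(props)]), ", ")
    throw(ArgumentError("key \"$key\" not found; available keys: $available"))
end

good = Dict("trackID" => 7, "name" => "patch")
bad = Dict("track_id" => 7)   # same information under a different key name

println(get_track_id(good))
try
    get_track_id(bad)
catch err
    println(err.msg)
end
```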

Comment thread src/PlottingCores/plottingcores.jl Outdated
Comment thread src/CalculateMigration/migration.jl Outdated
Comment thread src/CalculateMigration/migration_all.jl
Comment thread src/CalculateMigration/migration_all.jl Outdated
Comment thread src/CalculateMigration/migration_all.jl
Comment thread src/plot_patch_stats/plotting.jl
Comment thread src/RasterizeGeoJSON/rejectionsampling.jl Outdated
Comment thread src/RasterizeGeoJSON/rejectionsampling.jl Outdated
Comment thread src/RasterizeGeoJSON/rejectionsampling.jl Outdated
Comment thread .gitignore
@NiklasHohmann

Generally agree: keeping binary data on Zenodo and then downloading it automatically would be the smoothest option.
Another option would be to automate the whole Stacker workflow and programmatically generate the .mat file instead of archiving it.
No strong feeling either way.

xyl96 commented Mar 3, 2026

Contributor Author

Yes, the results MAT file is provided on Zenodo.

@EmiliaJarochowska

Member

> Yes, the results MAT file is provided on Zenodo.

I think that wasn't the point of this comment ;-) The point was not to put them on GitHub but to download them from Zenodo.

xyl96 commented Mar 3, 2026

Contributor Author

> Yes, the results MAT file is provided on Zenodo.
>
> I think that wasn't the point of this comment ;-) The point was not to put them on GitHub but to download them from Zenodo.

OK, I will add a download function to plottingcores.jl.
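A download step of this kind might look like the sketch below, which fetches a file only when it is not already present locally. It is not the function that was later added to plottingcores.jl: the name `fetch_if_missing` and the `downloader` keyword are hypothetical, and the Zenodo URL in the usage comment is a placeholder because the record ID is not given in this conversation. `Downloads` is part of Julia's standard library.

```julia
using Downloads  # standard library

# Hypothetical helper: fetch `url` into `dest` unless the file already exists.
# `downloader` is injectable so the logic can be exercised without network access.
function fetch_if_missing(url::AbstractString, dest::AbstractString;
                          downloader = Downloads.download)
    isfile(dest) && return dest          # already cached, nothing to do
    dir = dirname(dest)
    isempty(dir) || mkpath(dir)          # create the target folder if needed
    tmp = dest * ".part"                 # download to a temporary name first
    downloader(url, tmp)
    mv(tmp, dest; force = true)          # only a complete file gets the final name
    return dest
end

# Usage (placeholder URL: replace <record-id> with the actual Zenodo record):
# fetch_if_missing("https://zenodo.org/records/<record-id>/files/FaciesMosaic.mat",
#                  joinpath("data", "FaciesMosaic.mat"))
```

Writing to a `.part` file first means an interrupted download is not mistaken for a cached file on the next run, and the data folder can then be listed in .gitignore.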

delete useless files
@EmiliaJarochowska

Member

I need the current version of the data for #5 so I'll merge off this branch

@EmiliaJarochowska

Member

In order to do this I need some info from @xyl96.
My understanding is that FaciesMosaic.mat also contains the Z (time) dimension, i.e. all model output in terms of facies.
So does

facies = stratadata["facies"]
layers = stratadata["layers"]

create arrays of facies codes and of thicknesses?
However, layers contains negative values:

julia> layers
100×100×1000 Array{Float64, 3}:
[:, :, 1] =
 -2.1681   -2.07705  -2.12417  -2.24804  …  -6.96664  -6.74394  -7.00432  -6.97156
 -1.82665  -2.04633  -2.02343  -2.1511      -6.7736   -6.80748  -6.90505  -6.96027
 -1.99421  -2.0791   -2.06234  -2.04844     -6.87824  -6.80586  -6.85596  -6.78426
 -2.25707  -1.90442  -2.14208  -2.29268     -6.84399  -6.86519  -6.7282   -7.11105
  ⋮                                      ⋱                                
 -2.1038   -1.98354  -2.10764  -2.32388     -6.81002  -6.76294  -7.03121  -6.84482
 -2.01357  -2.11726  -2.15233  -2.17706     -6.71198  -6.76788  -7.05157  -6.96443
 -2.01388  -2.10531  -2.10149  -2.16529     -6.68735  -6.95486  -6.91334  -6.95715

etc.
So my questions:

  • how are these values measured? With reference to what?
  • why are there only 1000 values if there were 5000 time steps?

What I need for the power spectral densities are:

  • x(t): a vector of the facies class at the sampling point as a function of time step
  • x(z): a vector of the stratigraphic column, i.e. x resampled at uniform depth intervals

Can you add code to extract them from Stacker, @xyl96? I thought I could do this but I stumbled on the above questions.

Actually what I am trying to do would probably benefit from some depth-to-time transfers of admtools by @NiklasHohmann but I guess it's easier to stick to Julia at this point
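To make the x(t) to x(z) step concrete, here is a minimal sketch of resampling a facies time series at uniform depth intervals. It rests on assumptions that are not confirmed in this conversation: `thickness[i]` is taken to be the sediment thickness deposited in time step i (zero for a hiatus, no erosion), which is not the same as the `layers` array questioned above, and the function name `resample_to_depth` is hypothetical.

```julia
# Hypothetical sketch. Assumptions: `thickness[i]` >= 0 is the sediment deposited
# in time step i (0 for a hiatus) and `facies_t[i]` is the facies class of that step.
function resample_to_depth(facies_t::AbstractVector{<:Integer},
                           thickness::AbstractVector{<:Real}, dz::Real)
    length(facies_t) == length(thickness) ||
        throw(ArgumentError("facies_t and thickness must have the same length"))
    dz > 0 || throw(ArgumentError("dz must be positive"))
    all(>=(0), thickness) || throw(ArgumentError("erosion is not handled in this sketch"))
    tops = cumsum(thickness)                 # height of the top of each step's bed
    total = isempty(tops) ? 0.0 : tops[end]
    z = collect(dz / 2:dz:total)             # sample at the midpoint of each depth bin
    # The first bed whose top reaches z is the bed preserved at that height;
    # zero-thickness (hiatus) steps are skipped automatically.
    facies_z = [facies_t[searchsortedfirst(tops, zi)] for zi in z]
    return z, facies_z
end

# Toy column: step 2 is a hiatus, total thickness 1.0, sampled every 0.1.
z, facies_z = resample_to_depth([1, 1, 2, 2, 3], [0.2, 0.0, 0.3, 0.1, 0.4], 0.1)
```

The time series x(t) is simply `facies_t` itself, while `facies_z` plays the role of x(z), so the two vectors generally have different lengths.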

EmiliaJarochowska commented Mar 7, 2026

Member

PS what I am trying to do is to compute $|H(k)|^2$, the empirical power transfer function from the time series to the depth series.

Because the depth series has a different length than the time series, their PSDs are computed on their own grids and then interpolated to a common spatial frequency axis before taking the ratio.
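The procedure described above can be sketched as follows, reading $|H(k)|^2$ as the ratio of the depth-series PSD to the time-series PSD on a shared frequency axis. This is an illustration under assumptions, not the analysis code for #5: the function names are hypothetical, the periodogram uses a direct DFT to stay dependency-free, and the sample spacing of the time series (`d_t`) is assumed to be already expressed in length units (for example time step times mean accumulation rate) so that both spectra share a spatial frequency axis.

```julia
# Periodogram of a mean-removed series via a direct DFT (O(n^2), fine for a sketch).
# Returns the positive frequencies (cycles per unit of `d`) and the power at each.
function periodogram(x::AbstractVector{<:Real}, d::Real)
    n = length(x)
    n >= 4 || throw(ArgumentError("need at least 4 samples"))
    xc = x .- sum(x) / n
    ks = 1:(n ÷ 2)                           # zero frequency excluded
    freqs = [k / (n * d) for k in ks]
    power = [abs2(sum(xc[j] * cis(-2π * k * (j - 1) / n) for j in 1:n)) * d / n
             for k in ks]
    return freqs, power
end

# Linear interpolation of (xs, ys) at q; xs must be increasing and q within range.
function interp_linear(xs, ys, q)
    i = clamp(searchsortedlast(xs, q), 1, length(xs) - 1)
    t = (q - xs[i]) / (xs[i + 1] - xs[i])
    return (1 - t) * ys[i] + t * ys[i + 1]
end

# |H(k)|^2 = PSD of x_z divided by PSD of x_t on the overlapping frequency range.
# Assumption: d_t and d_z are both given in the same (length) unit.
function power_transfer(x_t, d_t, x_z, d_z; npoints = 50)
    f_t, p_t = periodogram(x_t, d_t)
    f_z, p_z = periodogram(x_z, d_z)
    lo = max(f_t[1], f_z[1])
    hi = min(f_t[end], f_z[end])
    lo < hi || throw(ArgumentError("frequency ranges do not overlap"))
    k = range(lo, hi; length = npoints)
    H2 = [interp_linear(f_z, p_z, ki) / interp_linear(f_t, p_t, ki) for ki in k]
    return collect(k), H2
end

# Sanity check on synthetic data: doubling the amplitude multiplies the power by 4.
x_t = [sin(j^2) for j in 1:64]
k, H2 = power_transfer(x_t, 1.0, 2 .* x_t, 1.0)
```

Facies codes are categorical, so in practice the input would need a numerical encoding first, and frequencies where the time-series power is close to zero make the ratio unstable.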

xyl96 commented May 19, 2026

Contributor Author

The layers values represent 'water depth', which is why they are negative. I think there are now 100 layers.
The code is now able to download the MAT files from Zenodo.

xyl96 requested a review from EmiliaJarochowska May 20, 2026 11:27