-
Notifications
You must be signed in to change notification settings - Fork 6
Speedup cosmodc2 #195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speedup cosmodc2 #195
Conversation
bsipocz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to get the rendering going on for this one for a better review. So I go ahead and add the minor fix and will return with a better review.
| display_name: Python 3 (ipykernel) | ||
| language: python | ||
| name: python3 | ||
| display_name: python3 | ||
| language: python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be great of not changing the order of these as it break CI as it currently is (I do use the name: python3 field in some matching with the assumption that it's the last one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry if that was me, and not jupyterlab!
|
|
||
| ```{code-cell} ipython3 | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ```{code-cell} ipython3 | |
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
| plt.show() | ||
| ``` | ||
|
|
||
| +++ {"jp-MarkdownHeadingCollapsed": true} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this was intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't even know what it is. could be I was clicking around and it did this without me realizing. It doesn't show up in the notebook view, so I never saw it.
This was just meant to be a comment-review, not an approval.
|
Well, we don't have a preview as ADS is down and that break the linkchecker 🤦♀️ |
troyraen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking this on. A few simple comments after reading through it.
| This tutorial demonstrates how to access and query the **CosmoDC2 Mock v1** catalogs using IRSA’s Table Access Protocol (TAP) service. Background information on the catalogs is available on the [IRSA CosmoDC2 page](https://irsa.ipac.caltech.edu/Missions/cosmodc2.html). | ||
|
|
||
| # Querying CosmoDC2 Mock v1 catalogs | ||
| The catalogs are served through IRSA’s Virtual Observatory–standard **TAP** interface (see the [IVOA TAP specification](https://www.ivoa.net/documents/TAP/)), which you can access programmatically in Python via the **PyVO** library. TAP queries are written in the **Astronomical Data Query Language (ADQL)** — a SQL-like language designed for astronomical catalogs (see the [ADQL specification](https://www.ivoa.net/documents/latest/ADQL.html)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about linking to IRSA's TAP page instead of the IVOA specs? https://irsa.ipac.caltech.edu/docs/program_interface/TAP.html. IMO, our page gives more practical and easily applicable info. It also includes both of these links to the IVOA specs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
| ```{code-cell} ipython3 | ||
| import time | ||
| starttime = time.time() | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is helpful for us, but from the user perspective I feel this is clutter so would suggest not having it here. If leaving in, there is no endtime, which seems like an oversight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, I just forgot to remove it
| import numpy as np | ||
| import matplotlib.mlab as mlab | ||
| import matplotlib.pyplot as plt | ||
| import time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| import time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
| # Now that we have a list of galaxy redshifts in that region, we can | ||
| # create a histogram of the redshifts to see what redshifts this survey includes. | ||
| if cone_results: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the if statement is doing anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, I have removed it here and below.
| # we will construct a 2D histogram rather than a scatter plot. | ||
| plt.hist2d(results['mag_true_r_sdss_z0'], results['mag_true_g_sdss_z0']-results['mag_true_r_sdss_z0'], | ||
| bins=200, cmap='plasma', cmax=500) | ||
| if redshift_results: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this if statement is doing anything either. Seems unlikely for the TAP query to return successfully but only return a blank table.
| # Show the plot. | ||
| plt.show() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary in a notebook.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I never understood that distinction, thanks. I removed it.
jkrick
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the reviews.
@bsipocz did you want to add more review comments? If not, this can be merged.
| display_name: Python 3 (ipykernel) | ||
| language: python | ||
| name: python3 | ||
| display_name: python3 | ||
| language: python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry if that was me, and not jupyterlab!
| plt.show() | ||
| ``` | ||
|
|
||
| +++ {"jp-MarkdownHeadingCollapsed": true} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't even know what it is. could be I was clicking around and it did this without me realizing. It doesn't show up in the notebook view, so I never saw it.
| This tutorial demonstrates how to access and query the **CosmoDC2 Mock v1** catalogs using IRSA’s Table Access Protocol (TAP) service. Background information on the catalogs is available on the [IRSA CosmoDC2 page](https://irsa.ipac.caltech.edu/Missions/cosmodc2.html). | ||
|
|
||
| # Querying CosmoDC2 Mock v1 catalogs | ||
| The catalogs are served through IRSA’s Virtual Observatory–standard **TAP** interface (see the [IVOA TAP specification](https://www.ivoa.net/documents/TAP/)), which you can access programmatically in Python via the **PyVO** library. TAP queries are written in the **Astronomical Data Query Language (ADQL)** — a SQL-like language designed for astronomical catalogs (see the [ADQL specification](https://www.ivoa.net/documents/latest/ADQL.html)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
| ```{code-cell} ipython3 | ||
| import time | ||
| starttime = time.time() | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, I just forgot to remove it
| import numpy as np | ||
| import matplotlib.mlab as mlab | ||
| import matplotlib.pyplot as plt | ||
| import time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
| # Now that we have a list of galaxy redshifts in that region, we can | ||
| # create a histogram of the redshifts to see what redshifts this survey includes. | ||
| if cone_results: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, I have removed it here and below.
| # Show the plot. | ||
| plt.show() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I never understood that distinction, thanks. I removed it.
|
|
||
| ```{code-cell} ipython3 | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
| display_name: Python 3 (ipykernel) | ||
| language: python | ||
| name: python3 | ||
| execution: | ||
| timeout: 2600 | ||
| display_name: python3 | ||
| language: python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is back again, I suspect jupytext is doing it, and it is breaking the CI.
One workaround for now could be to not include this patch in the commit, e.g. adding changes with git add --patch. This will be a problem for all notebooks, not just this one.
bsipocz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more comments based on the rendering I get locally.
| ### Count the total number of redshifts in the chosen table | ||
| answer for this table= 597,488,849 redshifts | ||
| ``` | ||
| adql = f"SELECT count(redshift) FROM {tablename}" | ||
| ``` | ||
| *** | ||
|
|
||
| ### Count galaxies in a sky region (cone search) | ||
| Useful for: estimating source density, validating spatial footprint, testing spatial completeness | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could unify the comment's/subtext's format a little bit. Have it for all the examples; have it as a complete sentence, either keep the numerical values in all (where applicable) or none.
And have the code block be sql at all or at none (though I don't see we have syntax colouring for sql)
433d0a2 to
0eeccc2
Compare
0eeccc2 to
c859996
Compare
bsipocz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so skipping for #201 is working again, so while I hijacked this PR as a testing bed, this is now good to go in.
Speedup cosmodc2 30cf6cc
This will close #144.
@bsipocz please also let me know if this covers #132
One problem in those issues was a backend problem. Yesterday Judith did something to it, and now it is lightning fast, and not as error prone. This PR is therefore not a significant change to the original code, but since I was working on that notebook, and spending time waiting for queries to finish, I cleaned up/added a bunch of clarifying text. I also added a section on further query examples that we don't run, but for which we show users the syntax. This now includes the previous commented out section which did a
counton the table.The notebook now runs in 10s on Fornax small.
Here is what Judith did, for documentation purposes:
"I’m starting to suspect that the slurmctld daemon might be in a funny state, though systemctl status slurmctld shows it as “active”. There are archived slurmctld logs under /var/log/slurm but no active log."
"journalctl -u slurmctld -f shows nothing. I checked the path.
Okay, will restart it.
Yes, we have a slurmctld.log file now."