Update osteo data prep#990

Open
sjspielman wants to merge 11 commits into master from sjspielman/987-prep-osteo-ref

@sjspielman

Closes #987
Partly addresses #988

This PR updates the Snakemake workflow that prepares osteo data in several ways:

  1. I updated how we read in the osteo SPE to ensure the image is actually preserved, and it seems to work. The code is more verbose but it is what it is!
  2. I fixed a bug in the existing notebook/pipeline that was specifying the human mito genes file. This is in fact a mouse dataset, and also (as now explained) there are no mito genes anyways! I still think it's appropriate to read it in for posterity though, hence the added text in the notebook.
  3. I added 2 additional steps to the Snakefile (and some config mods as well), to a) download and b) reformat the `OsteoCAR` mouse metastasis reference (subset to 10000 cells per annotation, and convert rownames to symbols).
    • I want to note the qs situation - authors fell out of compliance with CRAN so the package got removed, and qs2 doesn't work to read in files created with qs. This is a known annoyance for a lot of people! Now we too can be annoyed.
  4. I updated paths in general to use local paths, for two reasons:
    • This more closely matches other setup code which prepares into local directories, not /shared
    • I am currently running/developing locally with the workshop coming up, juuust in case something needs to happen with the server. Relative paths are just better.
  5. I updated the Snakefile to use pathlib, which is nicer to look at and work with.
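
For readers unfamiliar with pathlib, here is a minimal sketch of the kind of relative path handling described in items 4 and 5. The directory names and the output file name are hypothetical, chosen only for illustration; only `mm_mets_osteo_ref_raw.qs` appears in this PR.

```python
from pathlib import Path

# Hypothetical local directories (assumptions, not the PR's actual layout).
base_dir = Path("data")
raw_dir = base_dir / "raw"
processed_dir = base_dir / "processed"

# The raw reference file name comes from the Snakefile under review.
osteo_ref_raw = raw_dir / "mm_mets_osteo_ref_raw.qs"

# Hypothetical output name derived from the raw file name.
osteo_ref_out = processed_dir / osteo_ref_raw.name.replace("_raw.qs", ".rds")

print(osteo_ref_raw.as_posix())
print(osteo_ref_out.as_posix())
```

Building paths with the `/` operator keeps them relative to a single base directory, so pointing the workflow at a different location means changing one line.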

@sjspielman

Woops, forgot to request review here amid all the paper things I've been up to!

@sjspielman requested a review from jashapiro May 8, 2026 13:28

@jashapiro left a comment

Overall this looks good. Sorry that it took a bunch of extra work to make it useable!
My main notes are about updating the renv to include qs and a suggestion about downloading files with snakemake.

Turns out there is no overlap, which is not actually unexpected.
This data is from Visium and has 19465 genes, corresponding to the Mouse Visium v1 probe set which does not have mitochondrial genes.

<tapping-head.gif> Can't have low quality spots if you don't look at markers of low quality.
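
The overlap check described in the notebook text above amounts to a set intersection. The notebook itself is in R; this is only a rough Python sketch, and the gene lists are toy values, not the real probe set or mito gene file.

```python
def mito_overlap(dataset_genes, mito_genes):
    """Return the sorted mitochondrial genes that are present in the dataset."""
    return sorted(set(dataset_genes) & set(mito_genes))


# Toy identifiers for illustration only (assumptions, not real inputs).
dataset_genes = ["Actb", "Gapdh", "Col1a1"]
mito_genes = ["mt-Nd1", "mt-Co1", "mt-Atp6"]

overlap = mito_overlap(dataset_genes, mito_genes)
print(f"{len(overlap)} mitochondrial genes found in the dataset")
```

An empty result here means any mito-based QC metric would be zero for every spot, which is why the notebook now documents the check rather than filtering on it.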

make_option(
opt = c("-i", "--input"),
type = "character",
help = "Path to the input osteo reference RDS file."

Suggested change
help = "Path to the input osteo reference RDS file."
help = "Path to the input osteo reference qs file."

Comment thread spatial/setup/README.md
Comment on lines +83 to +84
This code requires the `qs` (not `qs2`) package to read in the raw `OsteoCAR` reference.
As of January 2026, this package is not currently from CRAN (but it may be back one day), so you may need to install with `remotes::install_github("qsbase/qs")`.

This should probably be part of the renv. (also why we don't use non-standard formats!)

# for the deconvolution notebooks
rule reformat_osteo_ref:
input:
osteo_ref = "mm_mets_osteo_ref_raw.qs", # doesn't need temp()

Inputs are never flagged as temp, but I also wouldn't set this file as temp at all. Is there a reason you would not want to keep the file around?

Alternatively, you can let snakemake handle the download directly, which may be preferable here (see https://snakemake.readthedocs.io/en/v6.1.0/snakefiles/remote_files.html)

You would need to add the following at the top:

from snakemake.remote.HTTP import RemoteProvider as HTTPRemoteProvider
HTTP = HTTPRemoteProvider()

Then you can replace this with:

Suggested change
osteo_ref = "mm_mets_osteo_ref_raw.qs", # doesn't need temp()
osteo_ref = HTTP.remote(config["osteo_ref_url"])

and remove the manual download.
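
If the manual download step is kept instead, one option is a small helper that skips the download when the file is already present, which fits with not marking the file as temp. This is a stdlib-only sketch; `download_if_missing` is a hypothetical name and is not part of this PR. Note also that the linked docs are for Snakemake v6.1.0, and newer releases may handle remote files differently, so it is worth checking the docs for the installed version.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen


def download_if_missing(url, dest):
    """Download url to dest unless dest already exists; return dest as a Path."""
    dest = Path(dest)
    if dest.exists():
        # Keep the existing file rather than downloading it again.
        return dest
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so an interrupted download
    # never leaves a partial file at the final path.
    tmp = dest.with_name(dest.name + ".part")
    with urlopen(url) as response, open(tmp, "wb") as handle:
        shutil.copyfileobj(response, handle)
    tmp.replace(dest)
    return dest
```

A call might look like `download_if_missing(config["osteo_ref_url"], "mm_mets_osteo_ref_raw.qs")`, reusing the config key from the suggestion above.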

@@ -0,0 +1,87 @@
#!/usr/bin/env Rscript

# This script reformats the osteo reference:

Suggested change
# This script reformats the osteo reference:
# This script reformats the osteo reference Seurat object
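
For context, the PR description says the reformatting step subsets the reference to 10000 cells per annotation. The actual script is in R and works on a Seurat object; the following is only a rough Python sketch of the capping logic, with a hypothetical function name, a hypothetical seed, and a plain dict standing in for the cell metadata.

```python
import random
from collections import defaultdict


def subsample_per_annotation(cell_annotations, max_cells=10000, seed=2026):
    """Keep at most max_cells cell IDs per annotation label.

    cell_annotations maps each cell ID to its annotation label.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    by_label = defaultdict(list)
    for cell_id, label in cell_annotations.items():
        by_label[label].append(cell_id)

    kept = []
    for label in sorted(by_label):
        cells = by_label[label]
        if len(cells) > max_cells:
            # Only labels above the cap are downsampled.
            cells = rng.sample(cells, max_cells)
        kept.extend(cells)
    return kept
```

Labels with fewer cells than the cap are kept whole, so rare annotations are not thinned further.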

Development

Successfully merging this pull request may close these issues.

Expand osteo data preprocessing for deconvolution
