Download fragments data
We will start by downloading the fragments data (and it’s index) from 10x:
if (!dir.exists("pbmc_data") || !file.exists("pbmc_data/fragments.tsv.gz") || !file.exists("pbmc_data/fragments.tsv.gz.tbi")) {
download_pbmc_example_data(fragments = TRUE)
}
This would download the fragments file and its index from:
PBMC from a healthy donor - granulocytes removed through cell sorting (10k)
Note that the ‘fragments’ file is large - ~2GB, so this step might take some time.
Import raw reads into a ScCounts format
mcATAC
allows you to load the raw counts data into an ScCounts
object. A ScCounts
holds the raw reads as a set of sparse matrices where the rows are genomic coordinates and the columns are cell barcodes. The number of sparse matrices (“genomic bins”) is controlled by the bin_size
parameter. The matrices are stored in a ‘gzip’ compressed form under a directory called ‘data’, together with a ‘yaml’ file containing the some metadata such as the genomic bins coordinates and the cell barcodes.
We can now import the raw reads into a ScCounts
object that we are going to store at “pbmc_data/reads”. Note that this might take ~4 minutes on a machine with 96 cores.
barcodes <- tgutil::fread("pbmc_data/barcodes.csv", header = FALSE)[, 1]
scc_from_fragments("pbmc_data/fragments.tsv.gz", "pbmc_data/reads", cell_names = barcodes, genome = "hg38", overwrite = TRUE)
If you have multiple batches, please add a suffix to the cell names using the cell_suffix
parameter, e.g.:
scc_from_fragments("pbmc_data/fragments.tsv.gz", "pbmc_data/reads", cell_names = barcodes, cell_suffix = "batch1", genome = "hg38", overwrite = TRUE)
Load ScCounts object
sc_counts <- scc_read("pbmc_data/reads")
sc_counts
Project RNA metacells
We can now project metacells derived from the RNA data (or any other way) into an McCounts
object which would hold the counts per metacell:
data(cell_to_metacell_pbmc_example)
mc_counts <- scc_to_mcc(sc_counts, cell_to_metacell_pbmc_example)
mc_counts
Save McCounts object
mcc_write(mc_counts, "pbmc_data/mc_counts", overwrite = TRUE)
Create ‘misha’ tracks for each metacell
We can now create a ‘misha’ track for each metacell:
mct <- mcc_to_tracks(mc_counts, "pbmc_mc", overwrite = TRUE, create_marginal_track = TRUE, window_size = NULL, resolution = 20)
mct
The window size paramter controls the smoothing of the tracks. If it set to NULL, the tracks will not be smoothed, except for the resolution.
We can now use the tracks we have created to identify candidate regularoty elements (CREs) for ‘trans’ analysis, or to analyse the genomic regions in ‘cis’.
Identify CREs (candidate regulatory elements)
We can now use the normalized marginal track we created in order to identify CREs:
cres <- call_peaks("pbmc_mc.marginal_normed", quantile_thresh = 0.9, min_umis = 8, target_size = 300, genome = "hg38")
We can diagnose our peak calling by plotting the marginal track:
plot_marginal_coverage("pbmc_mc.marginal_normed", interval = cres[967, ], window_size = 500, peaks = cres, expand = 1e4, show_thresh = TRUE, quantile_thresh = 0.9, min_umis = 8, genome = "hg38", log_scale = FALSE)
Create a McPeaks object from McCounts object
After we have indentified the CREs, we can create a McPeaks
object from the McCounts
object that we have created.
atac_mc <- mcc_to_mcatac(mc_counts, cres)
atac_mc
Create a McTracks object
We already created a ‘McTracks’ object when we created the tracks from the counts object, but we can also create it without re-generating the tracks:
mct <- mct_create(genome = "hg38", track_prefix = "pbmc_mc", id = mc_counts@id, description = mc_counts@description)
Add RNA and metadata to the McTracks object
We can now add the RNA and metadata to the McTracks object:
data(rna_mc_mat)
mct <- add_mc_rna(mct, rna_mc_mat)
An important metadata field is the ‘cell_type’ field. In addition, a field named ‘color’ can be added with a color for each cell type.
data(mcmd)
head(mcmd)
mct <- add_mc_metadata(mct, mcmd)
Plot a genomic region
mct_plot_region(mct, gintervals(5, 54513252, 55535253), detect_dca = TRUE, gene_annot = TRUE)
Create a shiny app
run_app(mct)
Deploy the shiny app
create_bundle(mct, path = "/path/to/the/server/pbmc_app", overwrite = TRUE, restart = TRUE)