Returns a matrix of genomic coordinates over the metacells. The matrix is cached in memory, so subsequent calls to this function do not need to extract the reads from the underlying tracks again.
When downsample is TRUE, downsampling is done by first setting a coverage goal. The reads within the region are then subsampled according to the ratio of the goal to the metacell coverage, i.e. to N_i * goal / C_i reads, where C_i is the total number of reads in metacell i, N_i is the total number of reads in the region for metacell i, and goal is the coverage goal.
For example, if the goal is 1M reads and a metacell has a total of 2M reads, 50K of them within the region, the reads within the region are subsampled to 50K * 1M/2M = 25K (50%). Metacells with total coverage below the goal are removed.
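The arithmetic can be sketched in base R. This is only an illustration, not the package's implementation: `downsample_target` is a hypothetical helper, the read counts are made up, and the sketch assumes the kept fraction is goal / C_i, so that a metacell with twice the goal keeps half of its region reads.

```r
# Hypothetical helper, not part of the package: number of region reads to
# keep per metacell, assuming the kept fraction is goal / C_i.
downsample_target <- function(region_reads, total_reads, goal) {
  keep <- total_reads >= goal  # metacells below the goal are dropped
  round(region_reads[keep] * goal / total_reads[keep])
}

# Illustrative counts for three metacells.
region_reads <- c(mc1 = 50e3, mc2 = 10e3, mc3 = 80e3)  # N_i
total_reads <- c(mc1 = 2e6, mc2 = 0.5e6, mc3 = 1e6)    # C_i

target <- downsample_target(region_reads, total_reads, goal = 1e6)
print(target)
```

Here mc1 is reduced to half of its region reads, mc2 is dropped because its total coverage is below the goal, and mc3 sits exactly at the goal and keeps all of its region reads.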
Usage
mct_get_mat(
  mct,
  intervals,
  downsample = FALSE,
  downsample_n = NULL,
  force = FALSE,
  seed = NULL
)
Arguments
- mct
a McTrack object
- intervals
an intervals set with a single interval. Note that if the start or end coordinates are not divisible by the resolution, the region will be extended to the next resolution interval.
- downsample
return a downsampled matrix. See description.
- downsample_n
total coverage goal. See description. Default: the lower 5th percentile of the total coverage.
- force
force the computation of the matrix. If FALSE, the matrix is retrieved from the cache if it exists.
- seed
random seed for the downsampling.
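The note on the intervals argument about coordinates that are not divisible by the resolution can be pictured with a small base R sketch. It assumes that "extended" means the start is rounded down and the end is rounded up to the nearest multiple of the resolution. `extend_to_resolution` is a hypothetical helper and the resolution of 20 is only an example value, so the package's own behaviour may differ in detail.

```r
# Hypothetical helper: snap an interval outward to multiples of `resolution`
# (start rounded down, end rounded up). An assumption, not the package code.
extend_to_resolution <- function(start, end, resolution) {
  c(
    start = floor(start / resolution) * resolution,
    end = ceiling(end / resolution) * resolution
  )
}

extended <- extend_to_resolution(start = 1005, end = 2090, resolution = 20)
print(extended)
```

With these inputs the interval 1005-2090 becomes 1000-2100, so the returned matrix would cover slightly more than the requested region.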