This function identifies "dynamic" peaks, i.e. those that have high expression only in a subset of the cells. They are identified by overdispersion of the coefficient of variation (COV; standard deviation divided by mean), assessed per quantile.
Usage
identify_dynamic_peaks(
  atac_mc,
  method = "bmq",
  plot = TRUE,
  mean_thresh_q = 0.1,
  cov_q_thresh = 0.75,
  num_bins = 200,
  gmm_g = 4
)
Arguments
- atac_mc
the McPeaks object to analyze
- method
(optional) either 'bmq' (default) or 'gmm'. 'bmq' (binned-mean quantiles) bins peaks by their log-mean (averaged across metacells) and selects all peaks whose coefficient of variation exceeds a given quantile within each bin. The more controlled 'gmm' method fits a Gaussian mixture model to the log10(COV) vs. log10(mean) distribution and selects peaks in clusters that show overdispersion in the COV.
- plot
whether to plot the peak mean vs. coefficient of variation (both on a log10 scale). It is highly recommended to inspect the scatter plot before proceeding, so set this parameter to FALSE only after you have made sure that the scatter looks reasonable.
- mean_thresh_q
(optional) threshold quantile on peaks' mean
- cov_q_thresh
(optional) minimum COV quantile, within each bin, above which a peak is considered dynamic
- num_bins
(optional) number of bins to divide features' means into
- gmm_g
(optional) number of groups for 'gmm'
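The 'bmq' idea described under method can be illustrated with a minimal base-R sketch on simulated data. This is not the package's implementation: the toy matrix, the equal-width binning on log10(mean), the reading of mean_thresh_q as a filter that drops the lowest-mean peaks, and all variable names are assumptions made for illustration, and num_bins is reduced to 20 to suit the small toy data set.

```r
# Illustrative sketch of binned-mean-quantile ('bmq') selection.
# NOT the package's code; data and binning scheme are assumptions.
set.seed(1)
n_peaks <- 2000
n_metacells <- 50

# Hypothetical peak-by-metacell matrix: gamma noise around a per-peak mean.
peak_means <- rlnorm(n_peaks, meanlog = 0, sdlog = 1)
mat <- matrix(
  rgamma(n_peaks * n_metacells, shape = 5,
         rate = 5 / rep(peak_means, n_metacells)),
  nrow = n_peaks
)

# Make 200 peaks "dynamic": boosted signal in the first 10 metacells only.
dyn <- sample(n_peaks, 200)
mat[dyn, 1:10] <- mat[dyn, 1:10] * 5

# Per-peak mean and coefficient of variation (sd / mean), on a log10 scale.
mu <- rowMeans(mat)
log_mu <- log10(mu)
log_cv <- log10(apply(mat, 1, sd) / mu)

mean_thresh_q <- 0.1
cov_q_thresh <- 0.75
num_bins <- 20  # toy value; the documented default is 200

# Assumed interpretation: ignore peaks below the mean_thresh_q quantile.
keep <- mu >= quantile(mu, mean_thresh_q)

# Assumed binning: equal-width bins over log10(mean).
bins <- cut(log_mu, breaks = num_bins)

# COV threshold per bin: the cov_q_thresh quantile of log10(COV) in that bin.
bin_thresh <- tapply(log_cv[keep], bins[keep], quantile, probs = cov_q_thresh)

# A peak is called dynamic if it passes the mean filter and its COV
# exceeds the threshold of its own bin.
is_dynamic <- keep & (log_cv > unname(bin_thresh[as.integer(bins)]))
is_dynamic[is.na(is_dynamic)] <- FALSE
```

Plotting log_mu against log_cv and colouring by is_dynamic gives a toy analogue of the scatter that the plot argument recommends inspecting before trusting the selection.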