This function identifies "dynamic" peaks, i.e. those that have high accessibility in only a subset of the metacells. They are identified by overdispersion in the coefficient of variation (COV; std.dev./mean) relative to other peaks with a similar mean.
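As a rough illustration of the statistic involved (not the package's internal code), the coefficient of variation of each peak can be computed from a peaks-by-metacells matrix. The matrix `mat` and its values are made up for this example.

```r
# Toy peaks-by-metacells matrix (hypothetical values, not taken from the package)
mat <- rbind(
  peak_flat    = c(5, 5, 5, 5),
  peak_dynamic = c(0, 0, 0, 20)
)

peak_mean <- rowMeans(mat)
peak_sd   <- apply(mat, 1, sd)
peak_cov  <- peak_sd / peak_mean  # coefficient of variation = std.dev. / mean

# Both peaks have the same mean, but only one is concentrated in a subset of metacells
print(data.frame(mean = peak_mean, cov = peak_cov))
```

The two peaks share the same mean, so only the COV separates the uniformly accessible peak from the one whose signal comes from a single metacell.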

Usage

identify_dynamic_peaks(
  atac_mc,
  method = "bmq",
  plot = TRUE,
  mean_thresh_q = 0.1,
  cov_q_thresh = 0.75,
  num_bins = 200,
  gmm_g = 4
)

Arguments

atac_mc

the McPeaks object to analyze

method

(optional) either 'bmq' (default) or 'gmm'. 'bmq' (binned-mean quantiles) bins the log-mean of all peaks (averaged across metacells) and selects all peaks with a coefficient of variation above some quantile in each bin. The more controlled 'gmm' method fits a Gaussian mixture model to the log10(COV) vs. log10(mean) distribution, and selects peaks in clusters that show overdispersion in the COV.

plot

whether to plot the peak mean vs. the coefficient of variation (both in log10 scale). It is highly recommended to look at the scatter plot before proceeding, so set this parameter to FALSE only after you have made sure that the scatter looks reasonable.

mean_thresh_q

(optional) threshold quantile on peaks' mean

cov_q_thresh

(optional) threshold on minimum COV quantile to consider as dynamic in each bin

num_bins

(optional) number of bins to divide features' means into

gmm_g

(optional) number of groups for 'gmm'
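The 'bmq' selection described above can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the package's implementation: the helper `bmq_sketch` and the simulated matrix are hypothetical, bins are assumed to be equal-width in log10(mean), and peaks below the `mean_thresh_q` quantile of the mean are assumed to be dropped before binning.

```r
set.seed(1)
n_peaks <- 2000
n_mc <- 30
# Hypothetical peaks-by-metacells matrix of accessibility values
mat <- matrix(rgamma(n_peaks * n_mc, shape = 2, rate = 1), nrow = n_peaks)

# Hypothetical helper, named for illustration only
bmq_sketch <- function(mat, mean_thresh_q = 0.1, cov_q_thresh = 0.75, num_bins = 200) {
  peak_mean <- rowMeans(mat)
  peak_cov <- apply(mat, 1, sd) / peak_mean

  # Assumption: peaks whose mean is below the mean_thresh_q quantile are dropped
  keep <- peak_mean >= quantile(peak_mean, mean_thresh_q)

  # Assumption: equal-width bins over log10(mean) of the retained peaks
  log_mean <- log10(peak_mean[keep])
  bin <- cut(log_mean, breaks = num_bins, labels = FALSE)

  # Within each bin, select peaks whose COV exceeds the bin's cov_q_thresh quantile
  cov_kept <- peak_cov[keep]
  bin_cutoff <- ave(cov_kept, bin, FUN = function(x) quantile(x, cov_q_thresh))

  selected <- rep(FALSE, nrow(mat))
  selected[keep] <- cov_kept > bin_cutoff
  selected
}

is_dynamic <- bmq_sketch(mat, num_bins = 20)
sum(is_dynamic)
```

Comparing each peak only to peers in the same mean bin keeps the selection from being dominated by low-mean peaks, whose COV tends to be high for purely statistical reasons.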

Value

a PeakIntervals object with the peaks identified as dynamic. If plot = TRUE, the selected points are also plotted.

Examples

if (FALSE) {
dynamic_peaks_by_bmq <- identify_dynamic_peaks(atac_mc, method = "bmq", mean_thresh_q = 0.1, cov_q_thresh = 0.6, num_bins = 100)
dynamic_peaks_by_gmm <- identify_dynamic_peaks(atac_mc, method = "gmm", gmm_g = 3)
}