Calculate Kolmogorov-Smirnov D statistics between two interval sets with motif energies
Source: R/motifs.R
calculate_d_stats.Rd
This function performs a one-sided KS test between a foreground set of peaks (pssm_fg) and a background set (pssm_bg).
With alternative = "less", the null hypothesis is that the foreground distribution is not less than the background distribution; use this when looking for motif enrichment. For anti-enrichment, use alternative = "greater".
See the ks.test documentation for further details.
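As a rough illustration of the test described above, the following sketch computes a one-sided D statistic per motif with base R's ks.test. It is not the package's implementation: the matrices fg and bg and the motif names are made-up stand-ins for pssm_fg and pssm_bg, and the restriction to shared columns is an assumption based on the argument description.

```r
# Hypothetical stand-in data: rows are peaks, columns are motif energies.
set.seed(1)
motifs <- c("motifA", "motifB")
fg <- matrix(rnorm(200, mean = 0.5), ncol = 2, dimnames = list(NULL, motifs))
bg <- matrix(rnorm(400), ncol = 2, dimnames = list(NULL, motifs))

# Assumption: only motifs present in both matrices are compared.
shared <- intersect(colnames(fg), colnames(bg))

# One-sided two-sample KS test per motif; keep only the D statistic.
d_stats <- vapply(shared, function(m) {
  unname(ks.test(fg[, m], bg[, m], alternative = "less")$statistic)
}, numeric(1))

print(d_stats)
```

In ks.test, the alternative refers to the cumulative distribution functions, so "less" yields a large D when the foreground CDF lies below the background CDF, that is, when foreground energies tend to be higher.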
Arguments
- pssm_fg
motif energies calculated for a certain set of motifs on a PeakIntervals/ScPeaks/McPeaks object
- pssm_bg
a background set of intervals (e.g. random genome, all ENCODE enhancers etc.) that includes all or a subset of the motifs (columns) in pssm_fg
- fg_clustering
a vector of cluster assignments for the foreground peaks (e.g. from gen_atac_peak_clust)
- parallel
(optional) - whether to parallelize computations
- alternative
indicates the alternative hypothesis and must be one of "two.sided" (default), "less", or "greater". You can specify just the initial letter of the value, but the argument name must be given in full. See ‘Details’ for the meanings of the possible values.
- nc
(optional) - number of cores for parallel computations
Value
If fg_clustering is supplied, returns a matrix of clusters x motifs (rows x columns) with the D statistic for each combination.
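To make the shape of the clustered result concrete, here is a minimal sketch that builds a clusters x motifs matrix of D statistics with base R. It is an illustration only: the data, the cluster vector, and the choice to compare each cluster against the full background are assumptions, not the package's confirmed behavior.

```r
# Hypothetical stand-in data: 90 foreground peaks, 300 background peaks, 2 motifs.
set.seed(2)
motifs <- c("motifA", "motifB")
fg <- matrix(rnorm(90 * 2), ncol = 2, dimnames = list(NULL, motifs))
bg <- matrix(rnorm(300 * 2), ncol = 2, dimnames = list(NULL, motifs))

# Hypothetical cluster assignment: one label per foreground peak.
clusters <- rep(1:3, each = 30)

# For each cluster, test its peaks against the whole background, motif by motif.
rows_by_cluster <- split(seq_len(nrow(fg)), clusters)
d_mat <- t(sapply(rows_by_cluster, function(rows) {
  vapply(motifs, function(m) {
    unname(ks.test(fg[rows, m], bg[, m], alternative = "less")$statistic)
  }, numeric(1))
}))

print(dim(d_mat))  # rows are clusters, columns are motifs
```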
Examples
if (FALSE) {
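# Motif energies for the foreground peaks
pssm_fg <- generate_motif_pssm_matrix(my_atac_mc, datasets_of_interest = "jaspar")
# Background: motif energies for the same number of random genomic peaks
pssm_bg <- gen_random_genome_peak_motif_matrix(num_peaks = nrow(my_atac_mc@peaks), datasets_of_interest = "jaspar")
# D statistics for all foreground peaks versus the random-genome background
d_vs_rg <- calculate_d_stats(pssm_fg, pssm_bg)
# Cluster the peaks, then compute D statistics per cluster
peak_clust <- gen_atac_peak_clust(my_atac_mc, k = 12)
d_vs_rg_cl <- calculate_d_stats(pssm_fg, pssm_bg, fg_clustering = peak_clust)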
}