Calculate Kolmogorov-Smirnov D statistics between two interval sets with motif energies
Source: R/motifs.R
calculate_d_stats.Rd
This function performs a one-sided KS test between a foreground set of peaks (pssm_fg) and a background set (pssm_bg).
With alternative = "less", the null hypothesis is that the foreground distribution is not less than the background distribution (applicable when looking for motif enrichment; for anti-enrichment, use alternative = "greater"). See the ks.test documentation for further details.
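To make the direction of the one-sided test concrete, here is a minimal sketch that calls base R's ks.test directly on simulated values. The vectors fg and bg are made-up stand-ins for a single motif column of pssm_fg and pssm_bg, and the sketch assumes the foreground is passed as the first argument. It is an illustration only, not the package's implementation.

```r
set.seed(1)

# Hypothetical energies for one motif: the foreground is shifted towards higher values
bg <- rnorm(500, mean = 0)
fg <- rnorm(500, mean = 1)

# alternative = "less": the alternative is that the CDF of fg lies below the CDF of bg,
# i.e. foreground energies tend to be higher (enrichment)
enrich <- ks.test(fg, bg, alternative = "less")

# alternative = "greater": the alternative is that the CDF of fg lies above the CDF of bg
# (anti-enrichment)
deplete <- ks.test(fg, bg, alternative = "greater")

# Extract the D statistics; the enrichment statistic should be the larger one here
d_enrich <- unname(enrich$statistic)
d_deplete <- unname(deplete$statistic)
```

In this simulated setting the foreground is shifted upwards, so the D statistic under alternative = "less" is large while the one under alternative = "greater" stays close to zero.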
Arguments
- pssm_fg
motif energies calculated for a certain set of motifs on a PeakIntervals/ScPeaks/McPeaks object
- pssm_bg
a background set of intervals (e.g. random genome, all ENCODE enhancers, etc.) that includes all or a subset of the motifs (columns) in pssm_fg
- fg_clustering
a vector of cluster assignments for the foreground peaks (e.g. from gen_atac_peak_clust)
- parallel
(optional) whether to parallelize computations
- alternative
indicates the alternative hypothesis and must be one of "two.sided" (default), "less", or "greater". You can specify just the initial letter of the value, but the argument name must be given in full. See the ks.test documentation for the meanings of the possible values.
- nc
(optional) number of cores for parallel computations
Value
If fg_clustering is provided, returns a matrix of clusters x motifs (rows x columns) with the D statistic for each combination.
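The shape of a clusters x motifs result can be sketched with base R alone. The helper d_stat_matrix below is hypothetical and is not the package's implementation: it assumes both inputs are numeric matrices with motif names as column names, restricts the comparison to motifs present in both, and runs ks.test for each cluster and motif. The simulated data is made up for illustration.

```r
# Hypothetical helper (not part of the package): D statistic per cluster and motif
d_stat_matrix <- function(fg, bg, clustering, alternative = "less") {
  motifs <- intersect(colnames(fg), colnames(bg))  # motifs present in both sets
  clusters <- sort(unique(clustering))
  d <- sapply(motifs, function(m) {
    sapply(clusters, function(cl) {
      # Foreground peaks of one cluster vs. the whole background, for one motif
      res <- ks.test(fg[clustering == cl, m], bg[, m], alternative = alternative)
      unname(res$statistic)
    })
  })
  # Force a clusters x motifs matrix even when there is a single cluster
  matrix(d, nrow = length(clusters),
         dimnames = list(as.character(clusters), motifs))
}

# Simulated energies: 150 foreground peaks in 3 clusters, 200 background peaks
set.seed(42)
motifs <- c("motifA", "motifB")
bg <- matrix(rnorm(400), ncol = 2, dimnames = list(NULL, motifs))
fg <- matrix(rnorm(300), ncol = 2, dimnames = list(NULL, motifs))
clustering <- rep(1:3, each = 50)

# Shift motifA upwards in cluster 1 to mimic enrichment
fg[clustering == 1, "motifA"] <- fg[clustering == 1, "motifA"] + 2

d_mat <- d_stat_matrix(fg, bg, clustering)
```

Each row of d_mat corresponds to a cluster and each column to a motif, so in this simulated case the largest value in the motifA column falls in the row for cluster 1.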
Examples
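if (FALSE) {
  # my_atac_mc is assumed to be an existing McPeaks object
  pssm_fg <- generate_motif_pssm_matrix(my_atac_mc, datasets_of_interest = "jaspar")
  # Background: random genome peaks, matched in number to the foreground peaks
  pssm_bg <- gen_random_genome_peak_motif_matrix(num_peaks = nrow(my_atac_mc@peaks), datasets_of_interest = "jaspar")
  # D statistics of the foreground peaks vs. the random-genome background
  d_vs_rg <- calculate_d_stats(pssm_fg, pssm_bg)
  # D statistics per foreground peak cluster
  peak_clust <- gen_atac_peak_clust(my_atac_mc, k = 12)
  d_vs_rg_cl <- calculate_d_stats(pssm_fg, pssm_bg, fg_clustering = peak_clust)
}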