Generate random genome motif PSSM matrix

Usage

gen_random_genome_peak_motif_matrix(
  num_peaks = 100000,
  peak_width = 200,
  bp_from_chrom_edge_to_avoid = 3000000,
  datasets_of_interest = NULL,
  motif_regex = NULL,
  motif_tracks = NULL,
  parallel = TRUE
)

Arguments

num_peaks

total number of random intervals to use as background (divided proportionately between chromosomes)

peak_width

(optional) - width (in bp) of the region around each peak center from which motif energies are extracted

bp_from_chrom_edge_to_avoid

distance (in bp) from each chromosome edge within which no intervals are sampled (e.g. to avoid acrocentric centromeres)

datasets_of_interest

(optional) - names of pssm datasets (name.key-name.data file combinations) to calculate PSSM values for

motif_regex

(optional) - a vector of regular expressions; motif PSSMs are extracted from the motif tracks whose names match

motif_tracks

(optional) - misha track names for which to extract motif PSSMs

parallel

(optional) - whether to use parallel computations

Value

a matrix of peaks (rows) vs. aggregated motif energies (columns)

Examples

if (FALSE) {
random_genome_motifs <- gen_random_genome_peak_motif_matrix(
  num_peaks = 5e4,
  datasets_of_interest = get_available_pssms(return_datasets_only = TRUE)
)
}
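
To make the roles of num_peaks, peak_width and bp_from_chrom_edge_to_avoid more concrete, the following base-R sketch draws random background intervals from a toy genome. It is not the package's implementation: the chromosome sizes, the helper name sample_background_intervals, and the choice to allocate intervals in proportion to the usable chromosome length are all assumptions made for illustration, and the motif energy extraction step is omitted entirely.

```r
# Hypothetical chromosome sizes in bp; a real genome would supply these.
chrom_sizes <- c(chr1 = 20e6, chr2 = 12e6, chr3 = 8e6)

# Illustrative helper (not part of the package): sample fixed-width
# intervals while staying away from chromosome edges.
sample_background_intervals <- function(chrom_sizes,
                                        num_peaks = 1000,
                                        peak_width = 200,
                                        bp_from_chrom_edge_to_avoid = 3e6) {
  half <- peak_width %/% 2

  # Usable length per chromosome after trimming both edges;
  # chromosomes too short to hold a single interval are dropped.
  usable <- chrom_sizes - 2 * bp_from_chrom_edge_to_avoid
  usable <- usable[usable > peak_width]

  # Assumption: intervals are split in proportion to usable length.
  # Rounding means the total can differ slightly from num_peaks.
  n_per_chrom <- round(num_peaks * usable / sum(usable))

  per_chrom <- lapply(names(usable), function(chrom) {
    n <- n_per_chrom[[chrom]]
    lo <- bp_from_chrom_edge_to_avoid + half
    hi <- chrom_sizes[[chrom]] - bp_from_chrom_edge_to_avoid - half
    # Draw peak centers uniformly inside the allowed range.
    centers <- floor(runif(n, min = lo, max = hi))
    data.frame(
      chrom = rep(chrom, n),
      start = centers - half,
      end = centers + half
    )
  })
  do.call(rbind, per_chrom)
}

set.seed(1)
bg <- sample_background_intervals(chrom_sizes)
head(bg)
```

With the toy sizes above, chr1 receives the most intervals because it has the longest usable span once 3 Mb is trimmed from each edge. A chromosome shorter than twice bp_from_chrom_edge_to_avoid would contribute no intervals in this sketch.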