Generate random genome motif PSSM matrix

Usage

gen_random_genome_peak_motif_matrix(
  num_peaks = 100000,
  peak_width = 200,
  bp_from_chrom_edge_to_avoid = 3000000,
  datasets_of_interest = NULL,
  motif_regex = NULL,
  motif_tracks = NULL,
  parallel = TRUE
)

Arguments

num_peaks: total number of intervals to use as background (divided proportionately between chromosomes)
peak_width: (optional) - width (in bp) of the region around each peak center from which motif energies are extracted
bp_from_chrom_edge_to_avoid: distance (in bp) from each chromosome edge within which no intervals are sampled (e.g. to avoid acrocentric centromeres)
datasets_of_interest: (optional) - names of PSSM datasets (name.key and name.data file pairs) for which to calculate PSSM values
motif_regex: (optional) - a vector of regular expressions matched against motif track names to select the tracks from which motif PSSMs are extracted
motif_tracks: (optional) - names of misha tracks from which to extract motif PSSMs
parallel: (optional) - whether to run computations in parallel
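
The interplay of num_peaks, peak_width and bp_from_chrom_edge_to_avoid can be illustrated with a small base-R sketch. This is not the package's implementation: the chromosome sizes are made up, allocation in proportion to usable chromosome length is an assumption (the description above says only "proportionately"), and sample_background_peaks is a hypothetical helper written for this illustration.

```r
# Hypothetical chromosome sizes (bp); real sizes would come from the genome database.
chrom_sizes <- c(chr1 = 200e6, chr2 = 150e6, chr3 = 50e6)

# Hypothetical helper, for illustration only.
sample_background_peaks <- function(chrom_sizes, num_peaks, peak_width, edge_bp) {
    # Length of each chromosome that remains after trimming both edges
    usable <- pmax(chrom_sizes - 2 * edge_bp, 0)
    # Assumption: peaks are allocated in proportion to usable length
    n_per_chrom <- floor(num_peaks * usable / sum(usable))
    # Assign the remainder left by rounding down to the largest chromosome
    largest <- which.max(usable)
    n_per_chrom[largest] <- n_per_chrom[largest] + num_peaks - sum(n_per_chrom)

    peaks <- lapply(names(chrom_sizes), function(chrom) {
        n <- n_per_chrom[[chrom]]
        # Draw peak centers uniformly, at least edge_bp away from both edges
        centers <- floor(runif(n, min = edge_bp, max = chrom_sizes[[chrom]] - edge_bp))
        start <- centers - peak_width %/% 2
        data.frame(chrom = rep(chrom, n), start = start, end = start + peak_width)
    })
    do.call(rbind, peaks)
}

set.seed(1)
bg <- sample_background_peaks(chrom_sizes, num_peaks = 1000, peak_width = 200, edge_bp = 3e6)
head(bg)
```

Each row of bg is one background interval of width peak_width. In this sketch only the peak centers are kept clear of the chromosome edges, so an interval may extend up to half a peak width into the excluded region.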

Value

a matrix of peaks (rows) vs. aggregated motif energies (columns)

Examples

if (FALSE) {
random_genome_motifs <- gen_random_genome_peak_motif_matrix(
    num_peaks = 5e+4,
    datasets_of_interest = get_available_pssms(return_datasets_only = TRUE)
)
}