Extract pwm of intervals from a motif database
gextract_pwm.Rd
Extract the pwm of each interval for each motif from a motif database. gextract_pwm_old
is an older version of this function, which is slower, and returns slightly different results due to float percision instead of double.
Usage
gextract_pwm_old(
intervals,
motifs = NULL,
dataset = all_motif_datasets(),
spat = NULL,
spat_min = 1,
spat_max = NULL,
bidirect = TRUE,
prior = 0.01,
func = "logSumExp",
parallel = getOption("prego.parallel", TRUE)
)
gextract_pwm(
intervals,
motifs = NULL,
dataset = MOTIF_DB,
spat = NULL,
spat_min = 1,
spat_max = NULL,
bidirect = TRUE,
prior = 0.01,
func = "logSumExp",
parallel = getOption("prego.parallel", TRUE)
)
Arguments
- intervals
misha intervals set
- motifs
names of specific motifs to extract from the dataset
- dataset
a data frame with PSSMs ('A', 'C', 'G' and 'T' columns), with an additional column 'motif' containing the motif name, for example
HOMER_motifs
orJASPAR_motifs
, orall_motif_datasets()
, or a MotifDB object.- spat
a data frame with the spatial model (as returned from the
$spat
slot from the regression). Should contain a column called 'bin' and a column called 'spat_factor'.- spat_min
the minimum position to use from the sequences. The default is 1.
- spat_max
the maximum position to use from the sequences. The default is the length of the sequences.
- bidirect
is the motif bi-directional. If TRUE, the reverse-complement of the motif will be used as well.
- prior
a prior probability for each nucleotide.
- func
the function to use to combine the PWMs for each sequence. Either 'logSumExp' or 'max'. The default is 'logSumExp'.
- parallel
logical, whether to use parallel processing
Value
The intervals set with additional columns per motif, containing the pwm of each interval for each motif
Examples
if (FALSE) { # \dontrun{
library(misha)
gdb.init_examples()
pwms <- gextract_pwm(gintervals.load("annotations"))
pwms[, 1:20]
} # }