Writes a misha fixed-bin dense track whose value at each output bin
is the summed natural-log conditional probability of the reference
sequence under the trained Markov model:
$$T(a) = \sum_{i=a}^{a+r-1} \log P_M(\mathrm{seq}[i] \mid
\mathrm{seq}[i-k..i-1], b(i))$$
where \(b(i)\) is the model's stratum bin (constant within each
model$iterator-bp window). The first \(k\) bp of every
chromosome are NA (no upstream context available); under default
policies, any output bin containing an NA per-bp contribution is NA.
A gsynth.model object (from
gsynth.train).
Name of the misha track to create.
Optional track description.
Intervals to score. Defaults to
gintervals.all(). Best results when interval
starts are aligned to multiples of
model$iterator; otherwise the first stratum
window is shorter than model$iterator and
its bin label may differ from training.
Optional intervals to NA-out in the output (e.g.
repeats). Intersects with intervals per bp;
every output bin containing a masked bp becomes
NaN.
Output bin size in bp. Defaults to
model$iterator. 1 produces a per-bp
track; any positive integer is allowed.
How to score positions whose stratum bin is
sparse in the model:
"NA" (default) propagates NA;
"uniform" contributes log(1/4) per
bp.
How to score positions whose k-mer context
contains an N: "NA" (default) or
"uniform" (log(1/4) per bp). The
predicted base itself is always NA when N — the
model has no \(\log P\) for non-ACGT bases.
If TRUE, replace an existing track of the
same name.
Invisibly NULL. Side effect: creates the named misha
track. Output bins are written as NaN where the sum
is undefined (out of intervals, or any per-bp NA).
Two models scored against the same reference give a windowed log Bayes factor as a track expression:
gsynth.score(genome_model, "genome_score")
gsynth.score(cre_model, "cre_score")
gextract("cre_score - genome_score", iterator=1000, ...)