
This function infers the motif energies for a set of peaks using a pre-trained trajectory model.

Usage

infer_trajectory_motifs(
  traj_model,
  peak_intervals,
  atac_scores = NULL,
  bin_start = 1,
  bin_end = ncol(atac_scores),
  additional_features = NULL,
  test_energies = NULL,
  diff_score = NULL,
  sequences = NULL,
  norm_sequences = NULL
)

Arguments

traj_model

A trajectory model object, as returned by regress_trajectory_motifs.

peak_intervals

A data frame giving the genomic position ('chrom', 'start', 'end') of each peak, with an additional column named "const" indicating whether the peak is constitutive. An optional column named "cluster" can give the cluster assignment of each peak.
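As a minimal sketch of the table described above, the following base R snippet builds a toy peak_intervals data frame and checks that the required columns are present. The coordinates and cluster labels are made up, and storing "const" as a logical vector is an assumption; the description only says the column marks constitutive peaks.

```r
# Toy peak table; all values are illustrative, not real data.
peak_intervals <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  start = c(1000L, 5000L, 20000L),
  end = c(1300L, 5300L, 20300L),
  const = c(FALSE, TRUE, FALSE), # assumption: logical flag for constitutive peaks
  cluster = c(1L, 2L, 1L),       # optional column
  stringsAsFactors = FALSE
)

# Sanity checks before passing the table on.
required_cols <- c("chrom", "start", "end", "const")
missing_cols <- setdiff(required_cols, colnames(peak_intervals))
stopifnot(length(missing_cols) == 0)
stopifnot(all(peak_intervals$start < peak_intervals$end))
```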

atac_scores

Optional. A numeric matrix of mean ATAC scores per bin per peak (rows: peaks, columns: bins). By default, iceqream regresses the last column minus the first column. To regress a different quantity, set bin_start or bin_end, or provide diff_score instead. If normalize_bins is TRUE, the scores are normalized to the range [0, 1].

bin_start

The start of the trajectory, given as a bin (column) index of atac_scores. Default: 1

bin_end

The end of the trajectory, given as a bin (column) index of atac_scores. Default: the last bin (only used when atac_scores is provided)
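The default target described for atac_scores (the end bin minus the start bin) can be sketched in base R as follows. The helper name trajectory_diff and the toy matrix are hypothetical and only illustrate the arithmetic; this is not the package's internal code, and any bin normalization is left out.

```r
# Toy score matrix: 3 peaks (rows) x 4 bins (columns); values are made up.
atac_scores <- matrix(
  c(0.1, 0.5, 0.9,
    0.2, 0.5, 0.7,
    0.4, 0.5, 0.3,
    0.8, 0.5, 0.1),
  nrow = 3,
  dimnames = list(paste0("peak", 1:3), paste0("bin", 1:4))
)

# Hypothetical helper: per-peak difference between the end and start bins.
trajectory_diff <- function(atac_scores, bin_start = 1, bin_end = ncol(atac_scores)) {
  stopifnot(bin_start >= 1, bin_end <= ncol(atac_scores))
  atac_scores[, bin_end] - atac_scores[, bin_start]
}

diff_default <- trajectory_diff(atac_scores)                       # last bin minus first bin
diff_custom <- trajectory_diff(atac_scores, bin_start = 2, bin_end = 3)
```

A vector computed this way is the kind of per-peak quantity that the diff_score argument accepts in place of atac_scores.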

additional_features

A data frame of additional genomic features (e.g. CpG content, distance to TSS) for each peak. NA values are replaced with 0.
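To make the NA handling concrete, here is a small base R sketch of a feature table and the equivalent zero-filling step. The column names and values are invented, and the assumption that rows follow the same order as the peaks is ours; the snippet only mirrors the documented behavior and is not the package's code.

```r
# Toy feature table, one row per peak (assumed to follow the peak order).
additional_features <- data.frame(
  cpg_content = c(0.62, NA, 0.18),
  dist_to_tss = c(1500, 32000, NA)
)

# Count missing values, then replace them with 0 as the documentation describes.
n_missing <- sum(is.na(additional_features))
additional_features[is.na(additional_features)] <- 0
```

Because missing values become 0, it may be worth checking beforehand whether 0 is a sensible stand-in for each feature (for a distance, for instance, it reads as "at the TSS").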

test_energies

A precomputed matrix of motif energies for the test peaks. This is an advanced option for supplying the energies directly.

diff_score

The difference in ATAC-seq scores between the end and start of the trajectory, for each peak. If provided, the atac_scores parameter is ignored.

sequences

A vector of strings containing the sequences of the peaks. If not provided, the sequences will be extracted from the genome using the peak intervals.

norm_sequences

A vector of strings containing the sequences of the normalization intervals. If not provided, the sequences will be extracted from the genome using the normalization intervals.

Value

A TrajectoryModel object which contains both the original ('train') peaks and the newly inferred ('test') peaks. The field @type indicates whether a peak is a 'train' or 'test' peak. R^2 statistics are computed and stored in object@params$stats.
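The sketch below shows one way the returned object could be inspected. It uses a hypothetical stand-in class, MockTrajectoryModel, that has only the two fields named above (@type and @params$stats), so that the snippet runs without the package. The real TrajectoryModel class has its own definition, and the contents of stats are left empty here because they are not described on this page.

```r
library(methods)

# Hypothetical stand-in with only the two documented fields; NOT the real class.
setClass("MockTrajectoryModel", representation(type = "character", params = "list"))

result <- new(
  "MockTrajectoryModel",
  type = c("train", "train", "test", "test", "test"),
  params = list(stats = list()) # placeholder; actual contents not specified here
)

# Separate the newly inferred peaks from the original training peaks.
is_test <- result@type == "test"
n_by_type <- table(result@type)

# The R^2 statistics would be read from here on a real object.
stats <- result@params$stats
```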