Creates a track from bam files.

gpatterns.import_from_bam(bams, workdir = NULL, track = NULL,
  steps = "all", paired_end = TRUE, cgs_mask_file = NULL, trim = NULL,
  umi1_idx = NULL, umi2_idx = NULL, use_seq = FALSE, only_seq = FALSE,
  frag_intervs = NULL, maxdist = 0, rm_off_target = TRUE,
  add_chr_prefix = FALSE, bismark = FALSE, nbins = nrow(gintervals.all()),
  groot = GROOT, import_raw_tcpgs = FALSE, use_sge = FALSE,
  max_jobs = 400, parallel = getOption("gpatterns.parallel"),
  cmd_prefix = "", run_per_interv = TRUE, ...)

Arguments

bams
character vector with paths of bam files
workdir
directory in which the files will be saved (please provide a full path)
track
name of the track to generate
steps
steps of the pipeline to run. Possible options are: 'bam2tidy_cpgs', 'filter_dups', 'bind_tidy_cpgs', 'pileup', 'pat_freq', 'pat_cov', or 'all' (the default)
paired_end
bam files are paired end, with R1 and R2 interleaved
cgs_mask_file
comma separated file with positions of cpgs to mask (e.g. MspI sticky ends). Needs to have 'chrom' and 'start' fields with the position of the 'C' in each cpg to mask
trim
trim cpgs that are within 'trim' bp of the beginning/end of the read
umi1_idx
position of umi1 in index (0 based)
umi2_idx
position of umi2 in index (0 based)
use_seq
use UMI sequence (not only position) to filter duplicates
only_seq
use only UMI sequence (without positions) to filter duplicates
frag_intervs
intervals set of fragments; read positions are changed to the positions of these fragments
maxdist
maximal distance from fragments
rm_off_target
if TRUE, remove reads with distance > maxdist from frag_intervs; if FALSE, those reads are left unchanged
add_chr_prefix
add "chr" prefix for chromosomes (in order to import to misha)
bismark
bam was aligned using bismark
nbins
number of genomic bins into which the analysis is split
groot
root of misha genomic database to save the tracks
import_raw_tcpgs
import raw tidy cpgs to misha (without filtering duplicates)
use_sge
use sun grid engine for parallelization
max_jobs
maximal number of jobs for sge parallelization
parallel
parallelize using threads (number of threads is determined by gpatterns.set_parallel)
cmd_prefix
prefix to run on 'system' commands (e.g. source ~/.bashrc)
run_per_interv
run the bam2tidy_cpgs scripts separately for each interval
...
additional parameters passed to gpatterns.import_from_tidy_cpgs
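
The layout expected for cgs_mask_file (a comma separated file with chrom and start fields) can be sketched in base R as follows. The coordinates and the temporary file name are made up for illustration, and the coordinate convention (0 or 1 based) is not stated in this page, so check it against your data before use.

```r
# Hypothetical positions of the 'C' in each cpg to mask (made-up coordinates).
cgs_mask <- data.frame(
    chrom = c("chr1", "chr1", "chr2"),
    start = c(10468L, 10470L, 20311L),
    stringsAsFactors = FALSE
)

# Write a comma separated file whose header row holds the field names.
mask_file <- tempfile(fileext = ".csv")
write.csv(cgs_mask, mask_file, row.names = FALSE, quote = FALSE)

# Read the file back to confirm that the 'chrom' and 'start' fields exist.
mask_check <- read.csv(mask_file, stringsAsFactors = FALSE)
print(mask_check)
```

The resulting path (here mask_file) is what would be passed as cgs_mask_file.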

Value

If 'stats' is one of the steps, a data frame with statistics. Otherwise nothing is returned.
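
A minimal sketch of a call, using only arguments that appear in the usage above. The bam paths, working directory and track name are hypothetical placeholders, and bismark = TRUE is an assumption about how the files were aligned. Because groot defaults to GROOT, the sketch also assumes that a misha database root has already been set in the session.

```r
# Hypothetical inputs: replace with real paths and names.
bams    <- c("/path/to/sample_lane1.bam", "/path/to/sample_lane2.bam")
workdir <- "/path/to/workdir"   # full path, as the workdir argument requests
track   <- "my_sample"          # hypothetical track name

# Collect the arguments in a list so they can be inspected before running.
import_args <- list(
    bams       = bams,
    workdir    = workdir,
    track      = track,
    steps      = "all",
    paired_end = TRUE,
    bismark    = TRUE           # assumption: bams were aligned with bismark
)

# Run only when gpatterns is installed and the input files actually exist.
# Assumes a misha database root is already set (groot defaults to GROOT).
if (requireNamespace("gpatterns", quietly = TRUE) && all(file.exists(bams))) {
    res <- do.call(gpatterns::gpatterns.import_from_bam, import_args)
}
```

Building the argument list first keeps the call easy to adapt, for example by replacing steps = "all" with a subset of the step names listed under the steps argument.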