Creates a track from bam files.
gpatterns.import_from_bam(bams, workdir = NULL, track = NULL,
steps = "all", paired_end = TRUE, cgs_mask_file = NULL, trim = NULL,
umi1_idx = NULL, umi2_idx = NULL, use_seq = FALSE, only_seq = FALSE,
frag_intervs = NULL, maxdist = 0, rm_off_target = TRUE,
add_chr_prefix = FALSE, bismark = FALSE, nbins = nrow(gintervals.all()),
groot = GROOT, import_raw_tcpgs = FALSE, use_sge = FALSE,
max_jobs = 400, parallel = getOption("gpatterns.parallel"),
cmd_prefix = "", run_per_interv = TRUE, ...)
Arguments
- bams
- character vector with the paths of the bam files
- workdir
- directory in which the files will be saved (provide the full path)
- track
- name of the track to generate
- steps
- steps of the pipeline to run (default: "all"). Possible options are:
'bam2tidy_cpgs', 'filter_dups', 'bind_tidy_cpgs', 'pileup', 'pat_freq', 'pat_cov'
- paired_end
- bam files are paired end, with R1 and R2 interleaved
- cgs_mask_file
- comma-separated file with the positions of cpgs to mask
(e.g. MspI sticky ends). Must have chrom and start fields giving the
position of the 'C' in each cpg to mask
- trim
- trim cpgs that are within 'trim' bp of the beginning/end of the read
- umi1_idx
- position of umi1 in the index (0-based)
- umi2_idx
- position of umi2 in the index (0-based)
- use_seq
- use UMI sequence (not only position) to filter duplicates
- only_seq
- use only UMI sequence (without positions) to filter duplicates
- frag_intervs
- intervals set of fragments; read positions are changed to these fragments
- maxdist
- maximal distance from fragments
- rm_off_target
- if TRUE, remove reads whose distance from frag_intervs is > maxdist;
if FALSE, leave those reads unchanged
- add_chr_prefix
- add "chr" prefix for chromosomes (in order to import to misha)
- bismark
- bam was aligned using bismark
- nbins
- number of genomic bins into which the analysis is split
- groot
- root of misha genomic database to save the tracks
- import_raw_tcpgs
- import raw tidy cpgs to misha (without filtering duplicates)
- use_sge
- use Sun Grid Engine (SGE) for parallelization
- max_jobs
- maximal number of jobs for SGE parallelization
- parallel
- parallelize using threads (number of threads is determined by gpatterns.set_parallel)
- cmd_prefix
- prefix to prepend to 'system' commands (e.g. source ~/.bashrc)
- run_per_interv
- run the bam2tidy_cpgs scripts separately for each interval
- ...
- additional parameters passed to gpatterns.import_from_tidy_cpgs
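The cgs_mask_file argument above is described only as a comma-separated file with chrom and start fields. A minimal sketch of preparing such a file in base R might look like the following. The coordinates are invented for illustration, and the presence of a header row and the coordinate convention of start are assumptions that should be checked against the package before use.

```r
# Hypothetical CpG positions to mask (made-up coordinates, for illustration only).
# Each row is meant to give the position of the 'C' of one CpG.
mask <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  start = c(10468L, 10470L, 20155L),
  stringsAsFactors = FALSE
)

# Write a comma-separated file with a header row naming the chrom and start fields.
# (Assumption: the pipeline reads the field names from the header.)
mask_file <- tempfile(fileext = ".csv")
write.csv(mask, mask_file, row.names = FALSE, quote = FALSE)

# Read the file back to confirm its layout before passing it as cgs_mask_file.
mask_check <- read.csv(mask_file, stringsAsFactors = FALSE)
print(mask_check)
```

The resulting path (mask_file here) would then be supplied as the cgs_mask_file argument.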
Value
If 'stats' is one of the steps, a data frame with statistics. Otherwise nothing is returned.
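Examples
The following is a minimal sketch of a call, not a tested recipe. The bam paths, working directory, and track name are hypothetical placeholders, and the chosen subset of steps is only one possible selection from the options listed under steps. It assumes that the gpatterns package is installed and that a misha database root is already set (the default groot = GROOT refers to it). The call is guarded so that the snippet does nothing when those conditions are not met.

```r
# Hypothetical inputs: replace with real absolute paths and a real track name.
bams <- c("/path/to/sample_rep1.bam", "/path/to/sample_rep2.bam")

# Collect the arguments in a list so they can be inspected before running.
import_args <- list(
  bams       = bams,
  workdir    = "/path/to/workdir",   # full path, as the workdir argument requests
  track      = "example_track",      # hypothetical track name
  steps      = c("bam2tidy_cpgs", "filter_dups", "bind_tidy_cpgs", "pileup"),
  paired_end = TRUE,
  bismark    = FALSE
)

# Run only if the package, the input files, and a misha root are all available.
# (Assumption: the misha root is visible as a variable named GROOT.)
can_run <- requireNamespace("gpatterns", quietly = TRUE) &&
  all(file.exists(bams)) &&
  exists("GROOT")

if (can_run) {
  res <- do.call(gpatterns::gpatterns.import_from_bam, import_args)
} else {
  message("Skipping import: gpatterns, the bam files, or a misha database is not available.")
}
```

Building the argument list first keeps the call easy to adapt, for example by adding umi1_idx and umi2_idx when UMIs are present.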