Create a track from tidy_cpgs files
gpatterns.import_from_tidy_cpgs(tidy_cpgs, track, description, steps = "all",
  overwrite = TRUE, cov_filt_cmd = NULL, dsn = NULL,
  pat_cov_lens = c(3, 5, 7), max_span = 500, pat_freq_len = 2,
  nbins = nrow(gintervals.all()), groot = GROOT, use_sge = FALSE,
  max_jobs = 400, parallel = getOption("gpatterns.parallel"))
Arguments
- tidy_cpgs
- tidy_cpgs data frame, or a vector of directories containing tidy_cpgs files (use full paths)
- track
- name of the track to generate
- description
- description of the track to generate
- steps
- steps of the pipeline to run. Possible options are:
'bind_tidy_cpgs', 'pileup', 'pat_freq', 'pat_cov', or 'all' (the default)
- overwrite
- overwrite existing tracks
- cov_filt_cmd
- if numeric, the maximal coverage allowed for a CpG. Otherwise, a string with a command for filtering highly (or lowly) covered CpGs: the command should give the maximal coverage, and 'covs' within it stands for the CpG coverage values, e.g. 'max(500, quantile(covs, 0.95))'.
- dsn
- downsampling n. Leave NULL for no downsampling
- pat_cov_lens
- lengths of patterns to calculate pattern coverage tracks for
- max_span
- maximal span to look for patterns (usually the maximal insert length)
- pat_freq_len
- lengths of patterns to calculate the pattern frequency track for
- nbins
- number of genomic bins to split the analysis into
- groot
- root of misha genomic database to save the tracks
- use_sge
- use Sun Grid Engine (SGE) for parallelization
- max_jobs
- maximal number of jobs for SGE parallelization
- parallel
- parallelize using threads (number of threads is determined by gpatterns.set_parallel)
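Examples
The two forms of cov_filt_cmd described above can be illustrated with a minimal base R sketch. It is not taken from the gpatterns source: resolve_max_cov is a hypothetical helper, the coverage values are made up, and the assumption is that a string command is evaluated as an R expression with 'covs' bound to the vector of per-CpG coverages.

```r
# Hypothetical helper (not part of gpatterns): turn a cov_filt_cmd value
# into a single numeric maximal coverage.
resolve_max_cov <- function(cov_filt_cmd, covs) {
    # Numeric input is taken as the maximal coverage itself
    if (is.numeric(cov_filt_cmd)) {
        return(cov_filt_cmd)
    }
    # Assumption: a string is an R expression in which 'covs' is the
    # vector of per-CpG coverages
    eval(parse(text = cov_filt_cmd), list(covs = covs))
}

# Made-up per-CpG coverage values, for illustration only
covs <- c(3, 10, 25, 40, 120, 800, 1500)

# The example command from the argument description
max_cov <- resolve_max_cov("max(500, quantile(covs, 0.95))", covs)

# Keep only CpGs whose coverage does not exceed the threshold
kept <- covs[covs <= max_cov]
```

With a command such as this one, the threshold never drops below 500, so lowly covered samples are not filtered by a quantile that happens to be small. In a real call the same string (or a plain number) would be passed as the cov_filt_cmd argument.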