Creates a new Genomic Database

Creates a new Genomic Database from FASTA files, optionally importing genes and annotations.

gdb.create(
  groot = NULL,
  fasta = NULL,
  genes.file = NULL,
  annots.file = NULL,
  annots.names = NULL
)

Arguments

groot: path to the newly created database
fasta: an array of names or URLs of FASTA files. Can contain wildcards to match multiple files
genes.file: name or URL of the file that contains genes. If 'NULL', no genes are imported
annots.file: name or URL of the file that contains annotations. If 'NULL', no annotations are imported
annots.names: annotation names

Value

None.

Details

This function creates a new Genomic Database at the location specified by 'groot'. FASTA files are converted to 'Seq' format and an appropriate 'chrom_sizes.txt' file is generated (see "User Manual" for more details).
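To check what was generated, the chromosome sizes file can be read back with base R. The helper below is a minimal sketch, not part of the package: the name `read_chrom_sizes` is hypothetical, and it assumes that 'chrom_sizes.txt' sits directly under 'groot' and holds two tab-separated columns (chromosome name, size) without a header line.

```r
# Hypothetical helper (not a misha function): read 'chrom_sizes.txt' from a
# database root. Assumes two tab-separated columns and no header line.
read_chrom_sizes <- function(groot) {
    path <- file.path(groot, "chrom_sizes.txt")
    if (!file.exists(path)) {
        stop(sprintf("No chrom_sizes.txt found under %s", groot), call. = FALSE)
    }
    utils::read.table(
        path,
        sep = "\t",
        header = FALSE,
        col.names = c("chrom", "size"),
        # keep chromosome names such as "1" as text, not integers
        colClasses = c("character", "numeric")
    )
}

# Usage, where 'groot' is the path that was passed to gdb.create():
# sizes <- read_chrom_sizes(groot)
# sizes[order(-sizes$size), ]
```

If the file layout differs in your version of the package, adjust `col.names` and `colClasses` accordingly.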

If 'genes.file' is not 'NULL', four sets of intervals are created in the database: tss, exons, utr3 and utr5. See gintervals.import_genes for more details about importing gene intervals.

Each of 'fasta', 'genes.file' and 'annots.file' can be either a file path or a URL of the form 'ftp://[address]/[file]'. 'fasta' can also contain wildcards to indicate multiple files. The files that these arguments point to can be zipped or unzipped.
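As an illustration of the local-path and wildcard case, the sketch below writes two toy FASTA files and passes a single pattern as 'fasta'. The directory names, file names and sequences are made up for the example; the "chr" file-name prefix simply mirrors the mm10 files used in the Examples section. `Sys.glob()` is base R and is used here only to preview what a shell-style pattern would match; this page does not state how `gdb.create` expands wildcards internally.

```r
# Build two tiny FASTA files so the sketch is self-contained
# (toy sequences, for illustration only).
fasta_dir <- file.path(tempdir(), "toy_fasta")
dir.create(fasta_dir, showWarnings = FALSE)
writeLines(c(">chr1", "ACGTACGTAC"), file.path(fasta_dir, "chr1.fa"))
writeLines(c(">chr2", "GGCCTTAA"), file.path(fasta_dir, "chr2.fa"))

# One wildcard pattern instead of listing every file.
fasta_pattern <- file.path(fasta_dir, "chr*.fa")

# Preview of the files a shell-style pattern matches (base R only).
matched <- basename(Sys.glob(fasta_pattern))
print(matched)

# Create the database only if misha is installed. 'genes.file' and
# 'annots.file' keep their NULL defaults, so no genes or annotations
# are imported.
if (requireNamespace("misha", quietly = TRUE)) {
    toy_db <- file.path(tempdir(), "toy_db")
    # assumption: 'groot' should not exist before the call
    unlink(toy_db, recursive = TRUE)
    misha::gdb.create(groot = toy_db, fasta = fasta_pattern)
}
```

An explicit character vector of file paths or URLs works the same way as the pattern, which is what the mm10 example does for a single chromosome.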

See the 'Genomes' vignette for details on how to create a database from common genome sources.

Examples

# \donttest{
ftp <- "ftp://hgdownload.soe.ucsc.edu/goldenPath/mm10"
mm10_dir <- file.path(tempdir(), "mm10")
# Only a single chromosome is loaded in this example.
# See the "Genomes" vignette for how to download all of them
# and how to download other genomes.
gdb.create(
    mm10_dir,
    paste(ftp, "chromosomes", paste0(
        "chr", c("X"),
        ".fa.gz"
    ), sep = "/"),
    paste(ftp, "database/knownGene.txt.gz", sep = "/"),
    paste(ftp, "database/kgXref.txt.gz", sep = "/"),
    c(
        "kgID", "mRNA", "spID", "spDisplayID", "geneSymbol",
        "refseq", "protAcc", "description", "rfamAcc",
        "tRnaName"
    )
)
#> Downloading ftp://hgdownload.soe.ucsc.edu/goldenPath/mm10/chromosomes/chrX.fa.gz
#> Building Seq files...
#> chrX
#> Downloading ftp://hgdownload.soe.ucsc.edu/goldenPath/mm10/database/knownGene.txt.gz
#> Downloading ftp://hgdownload.soe.ucsc.edu/goldenPath/mm10/database/kgXref.txt.gz
#> Database was successfully created
gdb.init(mm10_dir)
gintervals.ls()
#> [1] "exons" "tss"   "utr3"  "utr5" 
gintervals.all()
#>   chrom start       end
#> 1  chrX     0 171031299
# }
