Returns the result of evaluating track expressions for each of the iterator intervals.

gextract(
  ...,
  intervals = NULL,
  colnames = NULL,
  iterator = NULL,
  band = NULL,
  file = NULL,
  intervals.set.out = NULL
)

Arguments

...

one or more track expressions

intervals

genomic scope to which the function is applied

colnames

sets the column names in the returned value. If 'NULL', the names are set to the track expressions.

iterator

track expression iterator. If 'NULL', the iterator is determined implicitly from the track expressions.

band

track expression band. If 'NULL', no band is used.

file

name of the file to which the result is optionally written in tab-delimited format

intervals.set.out

name of the intervals set to which the result is optionally saved

Value

If 'file' and 'intervals.set.out' are both 'NULL', a set of intervals with an additional column for each of the track expressions and an 'intervalID' column.

Details

This function evaluates the track expressions for each of the iterator intervals. The returned value is a set of intervals with an additional column for each track expression, and it can be passed to any other function that accepts intervals. If the intervals in the 'intervals' argument overlap, gextract returns the overlapping coordinates more than once.
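The overlap behaviour described above can be checked with a small sketch. It assumes the misha package is installed and uses the example database and 'dense_track' from the Examples section; the two input intervals are made up for illustration and share the region [50, 100) on chr1.

```r
library(misha)
gdb.init_examples()

# Two illustrative intervals on chr1 that overlap in [50, 100).
overlapping <- rbind(
    gintervals(1, 0, 100),
    gintervals(1, 50, 150)
)

res <- gextract("dense_track", overlapping)

# Rows covering the shared region should appear once per input interval;
# 'intervalID' tells the copies apart.
res[order(res$start, res$intervalID), ]
```

If the duplicated rows are unwanted, one option is to merge the overlapping intervals before calling gextract.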

The order of rows in the result might differ from the order of the supplied intervals. An additional column, 'intervalID', is added to the returned value; it holds the index of the original interval in the supplied 'intervals'.
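As a rough illustration of how 'intervalID' can be used, the base R sketch below maps result rows back to the intervals that produced them. Both data frames are hand-written stand-ins: 'intervs' plays the role of the supplied 'intervals' (the 'name' column is a hypothetical annotation), and 'res' imitates the shape of a gextract result with made-up values.

```r
# Hypothetical input intervals, in the order they would be supplied.
intervs <- data.frame(
    chrom = c("chr1", "chr1"),
    start = c(0, 200),
    end   = c(100, 300),
    name  = c("region_a", "region_b")  # made-up annotation column
)

# Hand-written stand-in for a gextract() result; values are illustrative
# and the rows are deliberately not in the order of 'intervs'.
res <- data.frame(
    chrom       = "chr1",
    start       = c(200, 250, 0, 50),
    end         = c(250, 300, 50, 100),
    dense_track = c(0.16, 0.20, 0.18, 0.16),
    intervalID  = c(2, 2, 1, 1)
)

# 'intervalID' is the row index into the supplied intervals, so it can be
# used directly to look up per-interval annotations.
res$name <- intervs$name[res$intervalID]

# Summarise the extracted values per original interval.
per_interval <- tapply(res$dense_track, res$intervalID, mean)
```

Indexing by 'intervalID' rather than relying on row order keeps the lookup correct even when the result is returned in a different order.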

If the 'file' parameter is not 'NULL', the result is written to a tab-delimited text file (without the 'intervalID' column) rather than returned to the user. This is especially useful when the result is too big to fit into physical memory. The resulting file can be used as input for the 'gtrack.import' or 'gtrack.array.import' functions.
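A minimal sketch of writing the result to a file, assuming the misha package is installed and using the example database from the Examples section. The temporary file name is arbitrary, and the exact layout of the file is not assumed here; the sketch only inspects its first lines.

```r
library(misha)
gdb.init_examples()

# Arbitrary temporary path for the tab-delimited output.
out_file <- tempfile(fileext = ".tsv")

# With 'file' set, the result is written to disk instead of being returned.
gextract("dense_track", gintervals(1, 0, 500), file = out_file)

# Look at the beginning of the file rather than assuming its exact layout.
head(readLines(out_file), 3)
```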

If 'intervals.set.out' is not 'NULL', the result is saved as an intervals set. Like the 'file' parameter, 'intervals.set.out' can help overcome the limits of physical memory.

The 'colnames' parameter controls the names of the columns that contain the evaluated expressions. By default, the column names match the track expressions.

Examples

## limit the number of processes used by the examples
options(gmax.processes = 2)

gdb.init_examples()

## get values of 'dense_track' for [0, 500), chrom 1
gextract("dense_track", gintervals(1, 0, 500))
#>    chrom start end dense_track intervalID
#> 1   chr1     0  50   0.1777778          1
#> 2   chr1    50 100   0.1600000          1
#> 3   chr1   100 150   0.1800000          1
#> 4   chr1   150 200   0.1600000          1
#> 5   chr1   200 250   0.1600000          1
#> 6   chr1   250 300   0.2000000          1
#> 7   chr1   300 350   0.1600000          1
#> 8   chr1   350 400   0.1600000          1
#> 9   chr1   400 450   0.1600000          1
#> 10  chr1   450 500   0.0600000          1

## get values of 'rects_track' (a 2D track) for a 2D interval
gextract(
    "rects_track",
    gintervals.2d("chr1", 0, 4000, "chr2", 2000, 5000)
)
#> NULL

## get values of two track expressions, 'dense_track' and
## 'array_track * 2', using an iterator with a bin size of 100
gextract("dense_track", "array_track * 2", gintervals(1, 0, 500),
    iterator = 100, colnames = c("expr1", "expr2")
)
#>   chrom start end     expr1    expr2 intervalID
#> 1  chr1     0 100 0.1688889   8.0000          1
#> 2  chr1   100 200 0.1700000 212.3333          1
#> 3  chr1   200 300 0.1800000 407.2000          1
#> 4  chr1   300 400 0.1600000      NaN          1
#> 5  chr1   400 500 0.1100000      NaN          1