Creates a new virtual track.
emr_vtrack.create(
    vtrack,
    src,
    func = NULL,
    params = NULL,
    keepref = FALSE,
    time.shift = NULL,
    id.map = NULL,
    filter = NULL
)
vtrack: virtual track name. If 'NULL' is used, a unique name is generated.
src: data source. Either a track name or a list of two members: an ID-Time Values table (see "User Manual") and a logical. If the logical is 'TRUE', the data in the table is treated as categorical, otherwise as quantitative.
func: see below.
params: see below.
keepref: see below.
time.shift: time shift and expansion for iterator time.
id.map: id mapping.
filter: virtual track filter. Filters whose source is another virtual track are not allowed, in order to avoid loops.
Name of the virtual track (invisibly)
This function creates a new virtual track named 'vtrack'.
During the evaluation of a track expression that contains the virtual track 'vtrack', each iterator point of the id-time form (ID1, Time, Ref) is first transformed into an id-time interval: (ID2, Time1, Time2, Ref).
If 'id.map' is 'NULL' then ID1 == ID2; otherwise ID2 is derived from the translation table provided in 'id.map'. This table is a data frame whose first two columns are named 'id1' and 'id2', where 'id1' is mapped to 'id2'. If 'id.map' also contains an optional third column named 'time.shift', the value V of this column is used to shift the time accordingly, i.e. Time1 = Time2 = Time + V.
The 'time.shift' parameter (not to be confused with the 'time.shift' column of 'id.map') can be a single number X, in which case Time1 = Time2 = Time + X. Alternatively, 'time.shift' can be a vector of two numbers, i.e. 'c(X1, X2)', which results in Time1 = Time + X1, Time2 = Time + X2.
Both the 'time.shift' parameter and the 'time.shift' column within 'id.map' may be used simultaneously. In this case the time shifts are applied sequentially.
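The transformation of an iterator point into an id-time interval can be sketched in base R. The helper 'to_interval()' below is hypothetical and not part of naryn. It assumes that "applied sequentially" means the 'id.map' shift and the 'time.shift' parameter simply add up, and that an id missing from 'id.map' produces no interval; neither detail is stated above.

```r
# Hypothetical helper, not part of naryn: maps an iterator point
# (ID1, Time, Ref) to an id-time interval (ID2, Time1, Time2, Ref).
to_interval <- function(id, time, ref, time.shift = NULL, id.map = NULL) {
    id2 <- id
    t1 <- time
    t2 <- time
    if (!is.null(id.map)) {
        row <- match(id, id.map$id1)
        if (is.na(row)) {
            return(NULL) # assumption: an unmapped id yields no interval
        }
        id2 <- id.map$id2[row]
        if ("time.shift" %in% names(id.map)) {
            # Shift V taken from the optional third column of 'id.map'
            t1 <- t1 + id.map$time.shift[row]
            t2 <- t2 + id.map$time.shift[row]
        }
    }
    if (!is.null(time.shift)) {
        # A single number X is treated like c(X, X)
        shift <- rep(time.shift, length.out = 2)
        t1 <- t1 + shift[1]
        t2 <- t2 + shift[2]
    }
    data.frame(id = id2, time1 = t1, time2 = t2, ref = ref)
}

# Illustrative mapping table; the ids and shifts are made up
id_map <- data.frame(id1 = c(25, 27), id2 = c(125, 127), time.shift = c(2, 0))
iv <- to_interval(25, 10, -1, time.shift = c(-5, 10), id.map = id_map)
print(iv) # id = 125, time1 = 7, time2 = 22, ref = -1
```

Under these assumptions, id 25 at time 10 is mapped to id 125, shifted by 2 through 'id.map', and then expanded by c(-5, 10) to the interval [7, 22].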
In the next step, values from the data source 'src' that fall into the new id-time interval and pass the 'filter' are collected. 'src' may be either a track name or a list of two members: an ID-Time Values table (see "User Manual") and a logical. If the logical is 'TRUE', the data in the table is treated as categorical, otherwise as quantitative.
If 'keepref' is 'TRUE', the reference of each of these values must match 'ref' (the Ref of the iterator point), unless either the reference or 'ref' is '-1'.
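The collection step and the 'keepref' rule can be sketched in base R as follows. 'collect_values()' is a hypothetical helper, not a naryn function; it assumes the interval is closed on both ends and it leaves out the 'filter' step. The small table reuses a few id, time, ref and value combinations of the kind shown in the example output of this page.

```r
# Hypothetical helper, not part of naryn: selects the rows of an
# ID-Time Values table that fall into the interval [time1, time2].
collect_values <- function(src, id, time1, time2, ref = -1, keepref = FALSE) {
    # Assumption: both interval ends are inclusive
    hit <- src$id == id & src$time >= time1 & src$time <= time2
    if (keepref) {
        # References must match unless either side is -1
        hit <- hit & (src$ref == ref | src$ref == -1 | ref == -1)
    }
    src[hit, , drop = FALSE]
}

src <- data.frame(
    id    = c(25, 25, 25, 25, 25),
    time  = c(2, 2, 3, 6, 6),
    ref   = c(0, 2, 4, 0, 2),
    value = c(20, 22, 34, 60, 62)
)

# Without 'keepref' every value inside the interval is collected
all_refs <- collect_values(src, id = 25, time1 = 2, time2 = 6)
# With 'keepref' only the value whose reference equals 2 is kept
same_ref <- collect_values(src, 25, 2, 2, ref = 2, keepref = TRUE)
# A 'ref' of -1 matches any reference
any_ref <- collect_values(src, 25, 2, 2, ref = -1, keepref = TRUE)

print(all_refs$value) # 20 22 34 60 62
print(same_ref$value) # 22
print(any_ref$value)  # 20 22
```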
Function 'func' (with 'params') is then applied to the collected values and produces a single value, which is taken as the value of 'vtrack' for the given iterator point. If 'func' is 'NULL', it is implicitly set to 'value' if the data source is categorical, or to 'avg' if the data source is quantitative.
The following tables list all valid combinations of functions and parameters.
CATEGORICAL DATA SOURCE
FUNC | PARAM | DESCRIPTION |
value | vals/NULL | A source value or -1 if there is more than one. |
exists | vals | 1 if any of the 'vals' exist otherwise 0. |
sample | NULL | Uniformly sampled source value. |
sample.time | NULL | Time of the uniformly sampled source value. |
frequent | vals/NULL | The most frequent source value or -1 if there is more than one value. |
size | vals/NULL | Number of values. |
earliest | vals/NULL | Earliest value or -1 if there is more than one. |
latest | vals/NULL | Latest value or -1 if there is more than one. |
closest | vals/NULL | Value closest to the middle of the interval or -1 if there is more than one. |
earliest.time | vals/NULL | Time of the earliest value. |
latest.time | vals/NULL | Time of the latest value. |
closest.earlier.time | vals/NULL | Time of the earlier of the closest values. |
closest.later.time | vals/NULL | Time of the later of the closest values. |
dt1.earliest | vals/NULL | Time difference between the earliest value and T1. |
dt1.latest | vals/NULL | Time difference between the latest value and T1. |
dt2.earliest | vals/NULL | Time difference between T2 and the earliest value. |
dt2.latest | vals/NULL | Time difference between T2 and the latest value. |
* 'vals' is a vector of values. If not 'NULL' it serves as a filter: the function is applied only to the data source values that appear among 'vals'. 'vals' can be a single NA value, in which case all the values of the track would be filtered out.
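The role of 'vals' as a filter can be sketched in base R. The helpers 'keep_vals()', 'cat_exists()' and 'cat_size()' are hypothetical stand-ins for the 'exists' and 'size' rows of the table, not naryn functions, and the collected values are made up for illustration. What naryn itself returns when nothing is left after filtering is not specified above, so the sketch only shows the filtering.

```r
# Hypothetical helpers, not part of naryn.

# Keep only the collected values that appear among 'vals';
# 'NULL' means no filtering.
keep_vals <- function(values, vals = NULL) {
    if (is.null(vals)) values else values[values %in% vals]
}

# 'exists': 1 if any of 'vals' is present, otherwise 0
cat_exists <- function(values, vals) as.numeric(any(values %in% vals))

# 'size': number of values that survive the 'vals' filter
cat_size <- function(values, vals = NULL) length(keep_vals(values, vals))

collected <- c(3, 7, 7, 9) # illustrative categorical values

print(cat_exists(collected, c(7, 8))) # 1
print(cat_exists(collected, c(1, 2))) # 0
print(cat_size(collected))            # 4
print(cat_size(collected, c(7, 9)))   # 3
# A single NA filters out every value
print(length(keep_vals(collected, NA))) # 0
```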
QUANTITATIVE DATA SOURCE
FUNC | PARAM | DESCRIPTION |
avg | NULL | Average of all values. |
min | NULL | Minimal value. |
max | NULL | Maximal value. |
sample | NULL | Uniformly sampled source value. |
sample.time | NULL | Time of the uniformly sampled source value. |
size | NULL | Number of values. |
earliest | NULL | Average of the earliest values. |
latest | NULL | Average of the latest values. |
closest | NULL | Average of values closest to the middle of the interval. |
stddev | NULL | Unbiased standard deviation of the values. |
sum | NULL | Sum of values. |
quantile | Percentile in the range of [0, 1] | Quantile of the values. |
percentile.upper | NULL | Average of upper-bound values percentiles.* |
percentile.upper.min | NULL | Minimum of upper-bound values percentiles.* |
percentile.upper.max | NULL | Maximum of upper-bound values percentiles.* |
percentile.lower | NULL | Average of lower-bound values percentiles.* |
percentile.lower.min | NULL | Minimum of lower-bound values percentiles.* |
percentile.lower.max | NULL | Maximum of lower-bound values percentiles.* |
lm.intercept | NULL | Intercept (aka "alpha") of the simple linear regression (X = time, Y = values) |
lm.slope | NULL | Slope (aka "beta") of the simple linear regression (X = time, Y = values) |
earliest.time | NULL | Time of the earliest value. |
latest.time | NULL | Time of the latest value. |
closest.earlier.time | NULL | Time of the earlier of the closest values. |
closest.later.time | NULL | Time of the later of the closest values. |
dt1.earliest | NULL | Time difference between the earliest value and T1. |
dt1.latest | NULL | Time difference between the latest value and T1. |
dt2.earliest | NULL | Time difference between T2 and the earliest value. |
dt2.latest | NULL | Time difference between T2 and the latest value. |
* Percentile is calculated based on the values of the whole data source even if a subset or a filter are defined.
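Several of the quantitative functions can be approximated in base R to see what they compute. This is a sketch on made-up values, not naryn's implementation. It assumes that the middle of the interval is (T1 + T2) / 2, that 'dt1.*' is measured as value time minus T1 and 'dt2.*' as T2 minus value time, and that R's 'sd()' and 'lm()' correspond to 'stddev' and 'lm.slope' / 'lm.intercept'.

```r
# Illustrative values collected for the interval [time1, time2]
time1 <- 2
time2 <- 8
times <- c(2, 2, 3, 6, 6)
values <- c(20, 22, 34, 60, 62)

# 'earliest' / 'latest': average of the values at the first / last time
earliest <- mean(values[times == min(times)]) # 21
latest <- mean(values[times == max(times)])   # 61

# 'closest': average of the values nearest to the middle of the interval
mid <- (time1 + time2) / 2
dist <- abs(times - mid)
closest <- mean(values[dist == min(dist)]) # 61

# 'stddev': sd() divides by n - 1
stddev <- sd(values)

# 'lm.intercept' / 'lm.slope': simple regression with X = time, Y = values
fit <- lm(values ~ times)
lm_intercept <- unname(coef(fit)[1])
lm_slope <- unname(coef(fit)[2])

# 'dt1.earliest' and 'dt2.latest' under the sign assumption stated above
dt1_earliest <- min(times) - time1 # 0
dt2_latest <- time2 - max(times)   # 2

print(c(earliest = earliest, latest = latest, closest = closest))
print(c(intercept = lm_intercept, slope = lm_slope, stddev = stddev))
print(c(dt1.earliest = dt1_earliest, dt2.latest = dt2_latest))
```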
Note: 'time.shift' can be used only when 'keepref' is 'FALSE'. Also, when 'keepref' is 'TRUE', only 'avg', 'percentile.upper' and 'percentile.lower' can be used in 'func'.
emr_db.init_examples()
#> NULL
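# Shift the iterator time by 1 and take the maximum
emr_vtrack.create("vtrack1", "dense_track",
    time.shift = 1,
    func = "max"
)
# Minimum over the window [Time - 5, Time + 10]
emr_vtrack.create("vtrack2", "dense_track",
    time.shift = c(-5, 10), func = "min"
)
# Same window as vtrack2, but the source is an ID-Time Values table
# treated as quantitative (the logical is FALSE)
res <- emr_extract("dense_track", keepref = TRUE, names = "value")
emr_vtrack.create("vtrack3", list(res, FALSE),
    time.shift = c(-5, 10),
    func = "min"
)
emr_extract(c("dense_track", "vtrack1", "vtrack2", "vtrack3"),
    keepref = TRUE, iterator = "dense_track"
)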
#> id time ref dense_track vtrack1 vtrack2 vtrack3
#> 1 22 1 3 13 NaN 13 13
#> 2 24 1 3 13 NaN 13 13
#> 3 25 1 0 10 28 10 10
#> 4 25 2 0 20 34 10 10
#> 5 25 2 2 22 34 10 10
#> 6 25 2 5 24 34 10 10
#> 7 25 2 6 26 34 10 10
#> 8 25 2 8 28 34 10 10
#> 9 25 3 4 34 NaN 10 10
#> 10 25 6 0 60 NaN 10 10
#> 11 25 6 2 62 NaN 10 10
#> 12 25 8 1 80 94 34 34
#> 13 25 8 4 84 94 34 34
#> 14 25 9 2 92 104 60 60
#> 15 25 9 4 94 104 60 60
#> 16 25 10 4 104 NaN 60 60
#> 17 25 12 4 124 NaN 80 80
#> 18 27 23 4 234 NaN 234 234
#> 19 27 50 0 500 NaN 500 500
#> 20 28 1 3 13 NaN 13 13