Calculates quantiles of a track expression for the given percentiles.
emr_quantiles(
expr,
percentiles = 0.5,
stime = NULL,
etime = NULL,
iterator = NULL,
keepref = FALSE,
filter = NULL
)
expr: track expression.
percentiles: an array of percentiles, each in the [0, 1] range.
stime: start time scope.
etime: end time scope.
iterator: track expression iterator. If 'NULL', the iterator is determined implicitly based on the track expression. See also the 'iterator' section.
keepref: If 'TRUE', references are preserved in the iterator.
filter: Iterator filter.
An array that represents the quantiles.
This function calculates quantiles for the given percentiles.
If the data size exceeds the limit (see: 'getOption("emr_max.data.size")'), the data is randomly sampled to fit the limit, and a warning message is generated.
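The sampling behaviour described above can be sketched in base R. This is only an illustration: `quantiles_with_cap` and `max_size` are hypothetical names that are not part of the package, and base R's `quantile()` stands in for the package's own computation, which may differ in detail.

```r
# Hypothetical helper (not a package function): cap the data at `max_size`
# values by random sampling, then compute quantiles for the percentiles.
quantiles_with_cap <- function(values, percentiles, max_size) {
  if (length(values) > max_size) {
    warning("data size exceeds the limit; the data was randomly sampled")
    values <- sample(values, max_size)
  }
  quantile(values, probs = percentiles, names = FALSE)
}

set.seed(1)
x <- rnorm(10000)

# 10000 values exceed the cap of 1000, so a warning is raised and the
# quantiles are estimated from a random sample.
res <- suppressWarnings(quantiles_with_cap(x, c(0.1, 0.5, 0.9), max_size = 1000))
```

Because the result comes from a random sample, repeated calls on oversized data can return slightly different quantiles.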
There are a few types of iterators:
Track iterator: A track iterator returns the points (including the reference) from the specified track. The track name is specified as a string. If `keepref=FALSE`, the reference of each point is set to `-1`.
Example:
# Returns the level of glucose one hour after the insulin shot was made
emr_vtrack.create("glucose", "glucose_track", func="avg", time.shift=1)
emr_extract("glucose", iterator="insulin_shot_track")
Id-Time Points Iterator: An id-time points iterator generates points from an *id-time points table*. If `keepref=FALSE`, the reference of each point is set to `-1`.
Example:
# Returns the level of glucose one hour after the insulin shot was made
emr_vtrack.create("glucose", "glucose_track", func = "avg", time.shift = 1)
r <- emr_extract("insulin_shot_track") # <-- implicit iterator is used here
emr_extract("glucose", iterator = r)
Ids Iterator: An ids iterator generates points with ids taken from an *ids table* and times that run from `stime` to `etime` with a step of 1. If `keepref=TRUE`, the iterator generates 255 points for each id-time pair, with references running from `0` to `254`. If `keepref=FALSE`, only one point is generated for the given id and time, and its reference is set to `-1`.
Example:
stime <- emr_date2time(1, 1, 2016, 0)
etime <- emr_date2time(31, 12, 2016, 23)
emr_extract("glucose", iterator = data.frame(id = c(2, 5)), stime = stime, etime = etime)
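To make the number of generated points concrete, here is a base R sketch of the enumeration the ids iterator is described as performing. The function `ids_iterator_points` is hypothetical and only mimics the description above; it does not call the package or reflect its internals.

```r
# Hypothetical helper (not a package function): enumerate the points described
# for an ids iterator. One point per id and time unit, or 255 points
# (references 0..254) per id-time pair when keepref = TRUE.
ids_iterator_points <- function(ids, stime, etime, keepref = FALSE) {
  refs <- if (keepref) 0:254 else -1
  pts <- expand.grid(ref = refs, time = stime:etime, id = ids)
  pts[, c("id", "time", "ref")]
}

# Two ids over four time units: 2 * 4 = 8 points, all with reference -1.
pts <- ids_iterator_points(c(2, 5), stime = 100, etime = 103)

# With keepref = TRUE the same scope yields 8 * 255 points.
pts_ref <- ids_iterator_points(c(2, 5), stime = 100, etime = 103, keepref = TRUE)
```

The point count grows with the width of the time scope, so a year-long scope as in the example above produces thousands of points per id.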
Time Intervals Iterator: A *time intervals iterator* generates points for all the ids that appear in the 'patients.dob' track, with times taken from a *time intervals table* (see: Appendix). Each time starts at the beginning of the time interval and runs to the end of it with a step of 1. However, points that lie outside the `[stime, etime]` range are skipped.
If `keepref=TRUE`, the iterator generates 255 points for each id-time pair, with references running from `0` to `254`. If `keepref=FALSE`, only one point is generated for the given id and time, and its reference is set to `-1`.
Example:
# Returns the level of hangover for all patients on the day after New Year's Eve for the years 2015 and 2016
stime1 <- emr_date2time(1, 1, 2015, 0)
etime1 <- emr_date2time(1, 1, 2015, 23)
stime2 <- emr_date2time(1, 1, 2016, 0)
etime2 <- emr_date2time(1, 1, 2016, 23)
emr_extract("alcohol_level_track", iterator = data.frame(
stime = c(stime1, stime2),
etime = c(etime1, etime2)
))
Id-Time Intervals Iterator: An *id-time intervals iterator* generates, for each id, points that cover the `['stime', 'etime']` time range specified in the *id-time intervals table* (see: Appendix). Each time starts at the beginning of the time interval and runs to the end of it with a step of 1. However, points that lie outside the `[stime, etime]` range are skipped.
If `keepref=TRUE`, the iterator generates 255 points for each id-time pair, with references running from `0` to `254`. If `keepref=FALSE`, only one point is generated for the given id and time, and its reference is set to `-1`.
Beat Iterator: A *beat iterator* generates a "time beat" at the given period for each id that appears in the 'patients.dob' track. The period is always given in hours.
Example:
emr_extract("glucose_track", iterator=10, stime=1000, etime=2000)
This creates a beat iterator with a period of 10 hours, starting at `stime` and continuing until `etime` is reached. If, for example, `stime` equals `1000`, the beat iterator creates iterator points for each id at times 1000, 1010, 1020, ...
If `keepref=TRUE` for each id-time pair the iterator generates 255 points with references running from `0` to `254`. If `keepref=FALSE` only one point is generated for the given id and time, and its reference is set to `-1`.
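The beat times from the example above can be reproduced with plain `seq()`. This is a sketch only: `beat_times` is a hypothetical helper, and it assumes that `etime` itself is included when it falls exactly on a beat.

```r
# Hypothetical helper (not a package function): the times a beat iterator with
# the given period would visit for a single id.
# Assumption: etime is inclusive when it lands exactly on a beat.
beat_times <- function(period, stime, etime) {
  seq(from = stime, to = etime, by = period)
}

times <- beat_times(period = 10, stime = 1000, etime = 2000)
head(times, 3)  # 1000 1010 1020
```

The same sequence is generated for every id, so the total number of points is the number of ids multiplied by the number of beats.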
Extended Beat Iterator: An *extended beat iterator* is, as its name suggests, a variation on the beat iterator. It works on the same principle of creating time points at the given period. However, instead of basing the time count on `stime`, it accepts an additional parameter, a track or an *Id-Time Points table*, that specifies the initial time point for each id. The two parameters (period and mapping) should be passed together in a list. Each id may appear only once, and an id that does not appear at all is skipped by the iterator.
In any case, points that lie outside the `[stime, etime]` range are not generated.
Example:
# Returns the maximal weight of patients over a one-year span starting from their birthdays
emr_vtrack.create("weight", "weight_track", func = "max", time.shift = c(0, year()))
emr_extract("weight", iterator = list(year(), "birthday_track"), stime = 1000, etime = 2000)
Periodic Iterator: A periodic iterator goes over every year or month. You can create one by calling emr_monthly_iterator or emr_yearly_iterator.
Example:
iter <- emr_yearly_iterator(emr_date2time(1, 1, 2002), emr_date2time(1, 1, 2017))
emr_extract("dense_track", iterator = iter, stime = 1, etime = 3)
iter <- emr_monthly_iterator(emr_date2time(1, 1, 2002), n = 15)
emr_extract("dense_track", iterator = iter, stime = 1, etime = 3)
Implicit Iterator: The iterator is set implicitly if its value remains `NULL` (which is the default). In that case the track expression is analyzed and searched for track names. If all the track variables or virtual track variables point to the same track, this track is used as the source for a track iterator. If more than one track appears in the track expression, an error message is printed reporting the ambiguity.
Revealing Current Iterator Time:
During the evaluation of a track expression, one can access a specially defined variable named `EMR_TIME` (Python: `TIME`). This variable contains a vector (`numpy.ndarray` in Python) of the current iterator times. The length of the vector matches the length of the track variable (which is also a vector).
Note that some values in `EMR_TIME` might be set to 0. Skip those intervals and the values of the track variables at the corresponding indices.
# Returns times of the current iterator as a day of month
emr_extract("emr_time2dayofmonth(EMR_TIME)", iterator = "sparse_track")
emr_db.init_examples()
#> NULL
emr_quantiles("sparse_track", c(0.1, 0.6, 0.8))
#> 0.1 0.6 0.8
#> 13.0 90.8 168.0