Query operations

DataAxesFormats.Operations

— Module

A Daf query can use operations to process the data: EltwiseOperations that preserve the shape of the data, and ReductionOperations that reduce a matrix to a vector, or a vector to a scalar.

Element-wise operations

DataAxesFormats.Operations.Abs

— Type

Abs([; type::Maybe{Type} = nothing])

Element-wise operation that converts every element to its absolute value.

Parameters

type - The default output data type is the unsigned_type_for the input data type.

DataAxesFormats.Operations.Clamp

— Type

Clamp([; min::Maybe{StorageReal} = nothing, max::Maybe{StorageReal} = nothing])

Element-wise operation that converts every element to a value inside a range.

Parameters

min - If specified, values lower than this will be increased to this value.

max - If specified, values higher than this will be decreased to this value.

Note

At least one of min and max must be specified.
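The clamping rule above can be sketched in plain Julia. This is only an illustration of the documented semantics, not the library's implementation; `clamp_sketch` is a hypothetical helper name, and it relies on the standard `clamp` function from Base.

```julia
# Hypothetical helper mirroring the documented `min` / `max` parameters.
function clamp_sketch(values::AbstractArray{<:Real}; min = nothing, max = nothing)
    if min === nothing && max === nothing
        throw(ArgumentError("at least one of min and max must be specified"))
    end
    # A missing bound is treated as infinitely far away.
    lower = min === nothing ? -Inf : min
    upper = max === nothing ? Inf : max
    return clamp.(values, lower, upper)
end

clamped = clamp_sketch([-2.0, 0.5, 3.0]; min = 0.0, max = 1.0)  # [0.0, 0.5, 1.0]
```

Values below `min` are raised to it, values above `max` are lowered to it, and everything in between is left alone.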

DataAxesFormats.Operations.Convert

— Type

Convert([; type::Type])

Element-wise operation that converts every element to a given data type.

Parameters

type - The data type to convert to. There's no default.

DataAxesFormats.Operations.Fraction

— Type

Fraction([; type::Type])

Element-wise operation that converts every element to its fraction out of the total. If the total is zero, all the fractions are also set to zero. This implicitly assumes (but does not enforce) that all the entry values are positive.

For matrices, each entry becomes its fraction out of the total of the column it belongs to. For vectors, each entry becomes its fraction out of the total of the vector. For scalars, this operation makes no sense so fails with an error.

Parameters

type - The default output data type is the float_type_for of the input data type.
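As a rough illustration of the per-column behavior described above, the following plain-Julia sketch divides each matrix entry by its column total and maps zero-total columns to all zeros. The helper name `fraction_sketch` is hypothetical and this is not the library's code.

```julia
# Hypothetical helper: per-column fractions, with zero-total columns mapped to zeros.
function fraction_sketch(matrix::AbstractMatrix{<:Real})
    totals = sum(matrix; dims = 1)  # a 1 x n_columns row of column totals
    # Replace zero totals by one, so that 0 / 1 yields 0 rather than NaN.
    safe_totals = map(total -> total == 0 ? one(total) : total, totals)
    return matrix ./ safe_totals
end

counts = [1 0; 3 0]                   # the second column sums to zero
fractions = fraction_sketch(counts)   # [0.25 0.0; 0.75 0.0]
```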

DataAxesFormats.Operations.Log

— Type

Log(; type::Maybe{Type} = nothing, base::StorageReal = e, eps::StorageReal = 0)

Element-wise operation that converts every element to its logarithm.

Parameters

type - The default output data type is the float_type_for of the input data type.

base - The base of the logarithm. By default uses e (that is, computes the natural logarithm), which isn't convenient, but is the standard.

eps - Added to the input before computing the logarithm, to handle zero input data. By default is zero.
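The effect of `base` and `eps` can be sketched with the two-argument `log(base, x)` from Base. The helper `log_sketch` below is a hypothetical name used for illustration only; it is not the library's implementation.

```julia
# Hypothetical helper: add `eps` to every element, then take the logarithm in `base`.
log_sketch(values; base = ℯ, eps = 0) = log.(base, values .+ eps)

# With eps = 1 a zero input maps to log2(1) = 0 instead of -Inf.
logged = log_sketch([0, 1, 3]; base = 2, eps = 1)  # approximately [0.0, 1.0, 2.0]
```

Without a positive `eps`, a zero input would produce `-Inf`, which is why the parameter exists.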

DataAxesFormats.Operations.Round

— Type

Round([; type::Maybe{Type} = nothing])

Element-wise operation that converts every element to the nearest integer value.

Parameters

type - By default, uses the int_type_for the input data type.

DataAxesFormats.Operations.Significant

— Type

Significant(; high::StorageReal, low::Maybe{StorageReal} = nothing)

Element-wise operation that zeros all "insignificant" values. Significant values have a high absolute value. This is typically used to prune matrices of effect sizes (log of ratio between a baseline and some result) for heatmap display. For example, log base 2 of gene expression ratio is typically considered significant if it is at least 3 (that is, a ratio at least 8x or at most 1/8x); for genes that have a significant effect, we typically display all entries with a log of at least 2 (that is, a ratio of at least 4x or at most 1/4x).

For scalars, this operation makes no sense so fails with an error.

Parameters

high - A value is considered significant if its absolute value is higher than this. If the absolute values of all entries in a vector (or a matrix column) are less than this, then all the vector (or matrix column) entries are zeroed. There's no default.

low - If there is at least one significant value in a vector (or a matrix column), then zero all entries whose absolute value is lower than this. By default, this is the same as the high value. Setting it to a lower value will preserve more entries, but only for vectors (or matrix columns) which contain at least some significant data.
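The interplay between `high` and `low` is easier to see on a single vector (or one matrix column). The sketch below is a hypothetical helper, `significant_sketch`, written from the description above; whether the thresholds are inclusive is an assumption, since the text does not say.

```julia
# Hypothetical helper applying the documented rule to one vector (or one matrix column).
function significant_sketch(column::AbstractVector{<:Real}; high::Real, low::Real = high)
    magnitudes = abs.(column)
    # Assumption: threshold comparisons are inclusive (>=).
    if !any(magnitudes .>= high)
        return zero(column)  # no significant entry at all: zero everything
    end
    # At least one significant entry: keep everything that reaches `low`.
    return ifelse.(magnitudes .>= low, column, zero(eltype(column)))
end

kept = significant_sketch([0.5, -2.5, 3.5]; high = 3, low = 2)    # [0.0, -2.5, 3.5]
zeroed = significant_sketch([0.5, -2.5, 1.0]; high = 3, low = 2)  # [0.0, 0.0, 0.0]
```

In the first call the entry 3.5 makes the column significant, so -2.5 survives because it reaches `low`; in the second call nothing reaches `high`, so even -2.5 is zeroed.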

Reduction operations

DataAxesFormats.Operations.Sum

— Type

Sum(; type::Maybe{Type} = nothing)

Reduction operation that sums elements.

Parameters

type - By default, uses the sum_type_for the input data type.

DataAxesFormats.Operations.Max

— Type

Max()

Reduction operation that returns the maximal element.

DataAxesFormats.Operations.Min

— Type

Min()

Reduction operation that returns the minimal element.

DataAxesFormats.Operations.Median

— Type

Median(; type::Maybe{Type} = nothing)

Reduction operation that returns the median value.

Parameters

type - The default output data type is the float_type_for of the input data type.

DataAxesFormats.Operations.Quantile

— Type

Quantile(; type::Maybe{Type} = nothing, p::StorageReal)

Reduction operation that returns the quantile value, that is, a value such that a certain fraction of the values is lower.

Parameters

type - The default output data type is the float_type_for of the input data type.

p - The fraction of values below the result (e.g., 0 computes the minimum, 0.5 computes the median, and 1.0 computes the maximum). There's no default.
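The relation between `p` and the minimum, median and maximum can be checked with `quantile` from Julia's `Statistics` standard library. That this operation uses the same interpolation rule as `Statistics.quantile` is an assumption, not something stated here.

```julia
using Statistics

values = [1.0, 2.0, 3.0, 4.0, 5.0]

lowest = quantile(values, 0.0)   # 1.0, the minimum
middle = quantile(values, 0.5)   # 3.0, the median
highest = quantile(values, 1.0)  # 5.0, the maximum
```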

DataAxesFormats.Operations.Mean

— Type

Mean(; type::Maybe{Type} = nothing)

Reduction operation that returns the mean value.

Parameters

type - The default output data type is the float_type_for of the input data type.

DataAxesFormats.Operations.GeoMean

— Type

GeoMean(; type::Maybe{Type} = nothing, eps::StorageReal = 0.0)

Reduction operation that returns the geometric mean value.

Parameters

type - The default output data type is the float_type_for of the input data type.

eps - The regularization factor added to each value and subtracted from the raw geo-mean, to deal with zero values. By default is zero.
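The regularization described for `eps` can be written out as a short formula: add `eps` to each value, take the geometric mean, then subtract `eps` again. The helper `geomean_sketch` is a hypothetical name; the code is a sketch of the description, not the library's implementation.

```julia
using Statistics

# Hypothetical helper: geometric mean computed as exp(mean(log(values + eps))) - eps.
geomean_sketch(values; eps = 0.0) = exp(mean(log.(values .+ eps))) - eps

plain = geomean_sketch([1.0, 4.0])                  # approximately 2.0 (sqrt(1 * 4))
regularized = geomean_sketch([0.0, 3.0]; eps = 1.0) # approximately 1.0 (sqrt(1 * 4) - 1)
```

With `eps = 0` a single zero value forces the geometric mean to zero; a positive `eps` avoids that collapse.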

DataAxesFormats.Operations.Std

— Type

Std(; type::Maybe{Type} = nothing)

Reduction operation that returns the (uncorrected) standard deviation of the values.

Parameters

type - The default output data type is the float_type_for of the input data type.

DataAxesFormats.Operations.StdN

— Type

StdN(; type::Maybe{Type} = nothing, eps::StorageReal = 0)

Reduction operation that returns the (uncorrected) standard deviation of the values, normalized (divided) by the mean value.

Parameters

type - The default output data type is the float_type_for of the input data type.

eps - Added to the mean before computing the division, to handle zero input data. By default is zero.

DataAxesFormats.Operations.Var

— Type

Var(; type::Maybe{Type} = nothing)

Reduction operation that returns the (uncorrected) variance of the values.

Parameters

type - The default output data type is the float_type_for of the input data type.

DataAxesFormats.Operations.VarN

— Type

VarN(; type::Maybe{Type} = nothing, eps::StorageReal = 0.0)

Reduction operation that returns the (uncorrected) variance of the values, normalized (divided) by the mean of the values.

Parameters

type - The default output data type is the float_type_for of the input data type.

eps - Added to the mean before computing the division, to handle zero input data. By default is zero.
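For the four dispersion operations above (Std, StdN, Var and VarN), "uncorrected" means dividing by the number of values rather than by one less. In Julia's `Statistics` standard library this corresponds to `corrected = false`. The sketch below uses a hypothetical `normalized` helper to show the division by the mean (plus `eps`); it illustrates the definitions and is not the library's code.

```julia
using Statistics

values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # mean is 5

# Uncorrected statistics divide by n, not by n - 1.
std_uncorrected = std(values; corrected = false)  # approximately 2.0
var_uncorrected = var(values; corrected = false)  # approximately 4.0

# Hypothetical helper: normalize a statistic by the mean, with `eps` guarding a zero mean.
normalized(statistic, values; eps = 0) = statistic / (mean(values) + eps)

std_normalized = normalized(std_uncorrected, values)  # approximately 0.4
var_normalized = normalized(var_uncorrected, values)  # approximately 0.8
```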

DataAxesFormats.Operations.Mode

— Type

Mode()

Reduction operation that returns the most frequent value in the input (the "mode").

Note

This operation supports strings; most operations do not.
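Because the mode only needs equality between values, it works for strings as well as numbers. A minimal plain-Julia sketch follows; `mode_sketch` is a hypothetical helper, and how ties are broken is left unspecified here, as it is in the text.

```julia
# Hypothetical helper: return the most frequent value of a non-empty vector.
function mode_sketch(values::AbstractVector)
    counts = Dict{eltype(values), Int}()
    for value in values
        counts[value] = get(counts, value, 0) + 1
    end
    best_value, best_count = first(values), 0
    for (value, n) in counts
        if n > best_count
            best_value, best_count = value, n
        end
    end
    return best_value
end

most_common_type = mode_sketch(["T", "B", "T", "NK"])  # "T"
```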

DataAxesFormats.Operations.Count

— Type

Count(; type::Maybe{Type} = nothing)

Reduction operation that counts elements. This is useful when using GroupBy queries to count the number of elements in each group.

Note

This operation supports strings; most operations do not.

Parameters

type - By default, uses UInt32.

Support functions

DataAxesFormats.Operations.parse_parameter_value

— Function

parse_parameter_value(
    parse_value::Function,
    operation_name::Token,
    operation_kind::AbstractString,
    parameters_values::Dict{String, Token},
    parameter_name::AbstractString,
    default::Any,
)::Any

Parse an operation parameter.

DataAxesFormats.Operations.parse_number_value

— Function

parse_number_value(
    operation_name::AbstractString,
    parameter_name::AbstractString,
    parameter_value::Token,
    type::Type{T},
)::T where {T <: StorageReal}

Parse a numeric operation parameter.

DataAxesFormats.Operations.parse_number_type_value

— Function

parse_number_type_value(
    operation_name::AbstractString,
    parameter_name::AbstractString,
    parameter_value::Token,
)::Maybe{Type}

Parse the type operation parameter.

Valid names are {B,b}ool, {UI,ui,I,i}nt{8,16,32,64} and {F,f}loat{32,64}.
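The brace pattern above expands to each type's Julia name plus its all-lowercase form (for example, both `UInt8` and `uint8`). The lookup table below is a sketch of that expansion only; `type_by_name` and `parse_type_name_sketch` are hypothetical names, and raising an `ArgumentError` for an unknown name is an assumption about the error behavior.

```julia
# Hypothetical lookup table covering every name matched by the pattern above.
number_types = (Bool, Int8, Int16, Int32, Int64, UInt8, UInt16, UInt32, UInt64, Float32, Float64)
type_by_name = Dict{String, Type}()
for number_type in number_types
    name = string(number_type)                       # e.g. "UInt8"
    type_by_name[name] = number_type
    type_by_name[lowercase(name)] = number_type      # e.g. "uint8"
end

# Hypothetical helper: map a name to its type, or complain (assumed behavior).
function parse_type_name_sketch(name::AbstractString)
    haskey(type_by_name, name) || throw(ArgumentError("invalid type name: $(name)"))
    return type_by_name[name]
end

parsed = parse_type_name_sketch("float32")  # Float32
```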

DataAxesFormats.Operations.parse_float_type_value

— Function

parse_float_type_value(
    operation_name::AbstractString,
    parameter_name::AbstractString,
    parameter_value::Token,
)::Maybe{Type}

Similar to parse_number_type_value, but only accepts floating point types.

DataAxesFormats.Operations.parse_int_type_value

— Function

parse_int_type_value(
    operation_name::AbstractString,
    parameter_name::AbstractString,
    parameter_value::Token,
)::Maybe{Type}

Similar to parse_number_type_value, but only accepts integer (signed or unsigned) types.

DataAxesFormats.Operations.error_invalid_parameter_value

— Function

error_invalid_parameter_value(
    operation_name::Token,
    parameter_name::AbstractString,
    parameter_value::Token,
    must_be::AbstractString,
)::Nothing

Complain that an operation parameter value is not valid.

DataAxesFormats.Operations.float_type_for

— Function

float_type_for(
    element_type::Type{<:StorageReal},
    type::Maybe{Type{<:StorageReal}}
)::Type{<:AbstractFloat}

Given an input element_type, return the data type to use for the result of an operation that always produces floating point values (e.g., Log). If type isn't nothing, it is returned instead.

DataAxesFormats.Operations.int_type_for

— Function

int_type_for(
    element_type::Type{<:StorageReal},
    type::Maybe{Type{<:StorageReal}}
)::Type{<:Integer}

Given an input element_type, return the data type to use for the result of an operation that always produces integer values (e.g., Round). If type isn't nothing, it is returned instead.

DataAxesFormats.Operations.unsigned_type_for

— Function

unsigned_type_for(
    element_type::Type{<:StorageReal},
    type::Maybe{Type{<:StorageReal}}
)::Type

Given an input element_type, return the data type to use for the result of an operation that discards the sign of the value (e.g., Abs). If type isn't nothing, it is returned instead.

DataAxesFormats.Operations.sum_type_for

— Function

sum_type_for(
    element_type::Type{<:StorageReal},
    type::Maybe{Type{<:StorageReal}}
)::Type{<:StorageReal}

Given an input element_type, return the data type to use for the result of an operation that sums many such values (e.g., Sum). If type isn't nothing, it is returned instead.

This keeps floating point and 64-bit types as-is, but increases any small integer types to the matching 32-bit type (e.g., an input type of UInt8 will have a sum type of UInt32).
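The widening rule just described can be sketched as a small function. `sum_type_sketch` is a hypothetical stand-in written from the description, not the library's `sum_type_for`; in particular, treating `Bool` like a small unsigned integer is an assumption.

```julia
# Hypothetical stand-in for the documented rule; an explicit `type` always wins.
function sum_type_sketch(element_type::Type{<:Real}, type::Union{Nothing, Type} = nothing)
    type !== nothing && return type
    if element_type <: AbstractFloat || sizeof(element_type) == 8
        return element_type  # floating point and 64-bit types are kept as-is
    elseif element_type <: Signed
        return Int32         # small signed integers widen to Int32
    else
        return UInt32        # small unsigned integers (and, by assumption, Bool)
    end
end

widened = sum_type_sketch(UInt8)  # UInt32
```

Widening matters because adding two small integers in their own type wraps around: in Julia, `UInt8(200) + UInt8(200)` gives `0x90` (144) rather than 400.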
