Copies

DataAxesFormats.Copies Module

Copy data between Daf data sets.

Note

Copying into an in-memory data set does not duplicate the data; it only shares a reference to it, which is fast. In contrast, copying into a disk-based data set (e.g. using HDF5 or simple files) creates a duplicate of the data on disk, which is slow. In both cases, the copy does not significantly increase the amount of memory allocated by the application.

DataAxesFormats.Copies.copy_scalar! Function
copy_scalar!(;
    destination::DafWriter,
    source::DafReader,
    name::AbstractString,
    [rename::Maybe{AbstractString} = nothing,
    type::Maybe{Type{<:StorageScalarBase}} = nothing,
    default::Union{StorageScalar, Nothing, UndefInitializer} = undef,
    overwrite::Bool = false,
    insist::Bool = true]
)::Nothing

Copy a scalar with some name from some source DafReader into some destination DafWriter.

The scalar is fetched using the name and the default. If rename is specified, store the scalar using this new name. If type is specified, the data is converted to this type. If the scalar already exists in the target, then: if overwrite is set, it is replaced; otherwise, if insist is not set, the copy is skipped; otherwise, the copy fails.
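The overwrite and insist rule above recurs in the other copy functions, so it may help to see it as code. The following is a minimal sketch of the decision only; resolve_existing is a hypothetical helper written for illustration and is not part of DataAxesFormats.

```julia
# Hypothetical helper (not part of DataAxesFormats): decide what happens
# when the destination may already contain the copied property.
function resolve_existing(; exists::Bool, overwrite::Bool = false, insist::Bool = true)::Symbol
    if !exists
        return :write    # nothing in the way, store the copied data
    elseif overwrite
        return :replace  # overwrite takes precedence over insist
    elseif !insist
        return :skip     # keep the existing data, silently skip the copy
    else
        error("the property already exists in the destination")
    end
end

println(resolve_existing(; exists = true, overwrite = true))
println(resolve_existing(; exists = true, insist = false))
```

With the default flags (overwrite = false, insist = true), an existing property is an error, which is the safest behavior for a copy.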

DataAxesFormats.Copies.copy_axis! Function
copy_axis!(;
    destination::DafWriter,
    source::DafReader,
    axis::AbstractString,
    [rename::Maybe{AbstractString} = nothing,
    default::Union{Nothing, UndefInitializer} = undef,
    overwrite::Bool = false,
    insist::Bool = true]
)::Nothing

Copy an axis from some source DafReader into some destination DafWriter.

The axis is fetched using the axis name and the default. If rename is specified, store the axis using this name.

If the axis already exists in the target, then: if overwrite is set, it is replaced (erasing all data for that axis); otherwise, if insist is not set, the copy is skipped; otherwise, the copy fails.

DataAxesFormats.Copies.copy_vector! Function
copy_vector!(;
    destination::DafWriter,
    source::DafReader,
    axis::AbstractString,
    name::AbstractString,
    [reaxis::Maybe{AbstractString} = nothing,
    rename::Maybe{AbstractString} = nothing,
    type::Maybe{Type{<:StorageScalarBase}} = nothing,
    default::Union{StorageScalar, StorageVector, Nothing, UndefInitializer} = undef,
    empty::Maybe{StorageScalar} = nothing,
    bestify::Bool = false,
    min_sparse_saving_fraction::AbstractFloat = 0.25,
    overwrite::Bool = false,
    insist::Bool = true]
)::Nothing

Copy a vector from some source DafReader into some destination DafWriter.

The vector is fetched using the axis, name and the default. If reaxis is specified, store the vector using this axis. If rename is specified, store the vector using this name. If type is specified, the data is converted to this type. If the vector already exists in the target, then: if overwrite is set, it is replaced; otherwise, if insist is not set, the copy is skipped; otherwise, the copy fails.

If bestify is set, then bestify the data before writing it, using min_sparse_saving_fraction.

This requires that the axis in one data set is the same as, or is a superset of, or is a subset of, the axis in the other. If the target axis contains entries that do not exist in the source, then empty must be specified to fill the missing values. If the source axis contains entries that do not exist in the target, they are discarded (not copied).
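As an illustration of the subset-to-superset case, here is a minimal usage sketch. It assumes the DataAxesFormats package is installed and relies on MemoryDaf, add_axis!, set_vector! and get_vector, which belong to other modules of the package and are not described in this section; the data set names, the "cell" axis and the "age" property are invented for the example.

```julia
using DataAxesFormats

# Two in-memory data sets (names are arbitrary).
source = MemoryDaf(; name = "source!")
destination = MemoryDaf(; name = "destination!")

# The source axis is a subset of the destination axis.
add_axis!(source, "cell", ["A", "B"])
set_vector!(source, "cell", "age", [1, 2])
add_axis!(destination, "cell", ["A", "B", "C"])

# Entry "C" has no value in the source, so `empty` must be given.
copy_vector!(;
    destination = destination,
    source = source,
    axis = "cell",
    name = "age",
    empty = 0,
)

ages = get_vector(destination, "cell", "age")
println(ages)
```

Under the rules above, the entry that exists only in the destination ("C") should receive the empty value, while "A" and "B" keep their source values.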

DataAxesFormats.Copies.copy_matrix! Function
copy_matrix!(;
    destination::DafWriter,
    source::DafReader,
    rows_axis::AbstractString,
    columns_axis::AbstractString,
    name::AbstractString,
    [rows_reaxis::Maybe{AbstractString} = nothing,
    columns_reaxis::Maybe{AbstractString} = nothing,
    rename::Maybe{AbstractString} = nothing,
    eltype::Maybe{Type{<:StorageScalarBase}} = nothing,
    default::Union{StorageScalar, StorageVector, Nothing, UndefInitializer} = undef,
    empty::Maybe{StorageScalar} = nothing,
    bestify::Bool = false,
    min_sparse_saving_fraction::AbstractFloat = 0.25,
    relayout::Bool = true,
    overwrite::Bool = false,
    insist::Bool = true]
)::Nothing

Copy a matrix from some source DafReader into some destination DafWriter.

The matrix is fetched using the rows_axis, columns_axis, name, relayout and the default. If rows_reaxis and/or columns_reaxis are specified, store the matrix using these axes. If rename is specified, store the matrix using this name. If eltype is specified, the data is converted to this type. If the matrix already exists in the target, then: if overwrite is set, it is replaced; otherwise, if insist is not set, the copy is skipped; otherwise, the copy fails.

If bestify is set, then bestify the data before writing it, using min_sparse_saving_fraction.

This requires that each axis in one data set is the same as, or is a superset of, or is a subset of, the same axis in the other. If a target axis contains entries that do not exist in the source, then empty must be specified to fill the missing values. If a source axis contains entries that do not exist in the target, they are discarded (not copied).
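The axis-alignment rule can be sketched in plain Julia. The function below is a hypothetical illustration of the semantics (fill missing entries with empty, discard extra source entries) using a dense matrix; it is not the library's implementation, and align_matrix and its argument names are invented here.

```julia
# Hypothetical sketch (not the library's implementation): align a source
# matrix to the entries of the destination axes.
function align_matrix(source_rows, source_columns, values,
                      destination_rows, destination_columns; empty = nothing)
    row_index = Dict(name => index for (index, name) in enumerate(source_rows))
    column_index = Dict(name => index for (index, name) in enumerate(source_columns))

    # Destination entries without a source counterpart need a fill value.
    has_missing = any(!haskey(row_index, row) for row in destination_rows) ||
                  any(!haskey(column_index, column) for column in destination_columns)
    if has_missing && empty === nothing
        error("empty must be specified to fill the missing values")
    end

    element_type = empty === nothing ? eltype(values) : promote_type(eltype(values), typeof(empty))
    result = Matrix{element_type}(undef, length(destination_rows), length(destination_columns))
    for (j, column) in enumerate(destination_columns), (i, row) in enumerate(destination_rows)
        if haskey(row_index, row) && haskey(column_index, column)
            result[i, j] = values[row_index[row], column_index[column]]
        else
            result[i, j] = empty  # entry exists only in the destination
        end
    end
    return result  # source entries absent from the destination are dropped
end

aligned = align_matrix(["A", "B"], ["X", "Y"], [1 2; 3 4], ["A", "B", "C"], ["Y"]; empty = 0)
println(aligned)
```

Here the rows axis grows (entry "C" is filled with the empty value) while the columns axis shrinks (column "X" is discarded).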

Note

When copying a matrix from a subset to a superset, if the empty value is zero, then we create a sparse matrix in the destination. However, currently we create a temporary dense matrix for this; this is inefficient and should be replaced by a more efficient method.

DataAxesFormats.Copies.copy_tensor! Function
copy_tensor!(;
    destination::DafWriter,
    source::DafReader,
    main_axis::AbstractString,
    rows_axis::AbstractString,
    columns_axis::AbstractString,
    name::AbstractString,
    [rows_reaxis::Maybe{AbstractString} = nothing,
    columns_reaxis::Maybe{AbstractString} = nothing,
    rename::Maybe{AbstractString} = nothing,
    eltype::Maybe{Type{<:StorageScalarBase}} = nothing,
    empty::Maybe{StorageScalar} = nothing,
    bestify::Bool = false,
    min_sparse_saving_fraction::AbstractFloat = 0.25,
    relayout::Bool = true,
    overwrite::Bool = false,
    insist::Bool = true]
)::Nothing

Copy a tensor from some source DafReader into some destination DafWriter.

If bestify is set, then bestify the data before writing it, using min_sparse_saving_fraction.

This is basically a loop that calls copy_matrix! for each of the tensor matrices, based on the entries of the main_axis in the destination. This will create a matrix full of the empty value for any entries of the main axis which exist in the destination but do not exist in the source. If a tensor matrix already exists in the target, then: if overwrite is set, it is replaced; otherwise, if insist is not set, the copy is skipped; otherwise, the copy fails.

DataAxesFormats.Copies.copy_all! Function
copy_all!(;
    destination::DafWriter,
    source::DafReader,
    [empty::Maybe{EmptyData} = nothing,
    types::Maybe{DataTypes} = nothing,
    overwrite::Bool = false,
    insist::Bool = true,
    relayout::Bool = true]
)::Nothing

Copy all the content of a source DafReader into a destination DafWriter. If some data already exists in the target, then: if overwrite is set, it is replaced; otherwise, if insist is not set, the copy is skipped; otherwise, the copy fails.

This will create target axes that exist only in the source, but will not overwrite existing target axes, regardless of the value of overwrite. An axis that exists in the target must be identical to, or be a subset of, the same axis in the source.

If the source has axes which are a subset of the same axes in the target, then you must specify a dictionary of values for the empty entries that will be created in the target when copying any vector and/or matrix properties. This is specified using a (axis, property) => value entry for specifying an empty value for a vector property and a (rows_axis, columns_axis, property) => value entry for specifying an empty value for a matrix property. The order of the axes for matrix properties doesn't matter (the same empty value is automatically used for both axes orders).

If types are specified, the copied data of the matching property is converted to the specified data type.

If a TensorKey is specified, this will create a matrix full of the empty value for any entries of the main axis which exist in the destination but do not exist in the source.

DataAxesFormats.Copies.EmptyData Type

Specify the data to use for missing properties in a Daf data set. This is a dictionary with a DataKey specifying the property for which we specify a value, and the value to use. We would have liked to specify this as AbstractDict{<:DataKey, <:StorageScalarBase} but Julia in its infinite wisdom considers Dict(["a" => "b", ("c", "d") => 1]) to be a Dict{Any, Any}, which would require literals to be annotated with the type.
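The typing issue can be reproduced in plain Julia. In the sketch below, ExampleKey and ExampleValue are hypothetical stand-ins for the package's key and value types, chosen only so the example runs without the package.

```julia
# The literal from the text: keys and values of mixed types.
literal = Dict(["a" => "b", ("c", "d") => 1])
println(typeof(literal))  # Dict{Any, Any}

# Hypothetical stand-ins for the package's key and value types.
ExampleKey = Union{String, Tuple{String, String}}
ExampleValue = Union{String, Int}

# The inferred type does not match a constrained dictionary type...
literal_matches = typeof(literal) <: AbstractDict{<:ExampleKey, <:ExampleValue}

# ...unless the literal is explicitly annotated with its type.
annotated = Dict{ExampleKey, ExampleValue}("a" => "b", ("c", "d") => 1)
annotated_matches = typeof(annotated) <: AbstractDict{<:ExampleKey, <:ExampleValue}

println(literal_matches, " ", annotated_matches)
```

A constrained signature would therefore reject the unannotated literal, which is the inconvenience the looser type avoids.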

Note

A TensorKey is interpreted as if it is the set of MatrixKeys that are included in the tensor. These are expanded in an internal copy of the dictionary and will override any other specified MatrixKey.

DataAxesFormats.Copies.DataTypes Type

Specify the data type to use for overriding property types in a Daf data set. This is a dictionary with a DataKey specifying the property for which we specify a type, and the data type to use. We would have liked to specify this as AbstractDict{<:DataKey, Type{<:StorageScalarBase}} but Julia in its infinite wisdom considers Dict(["a" => Bool, ("c", "d") => Int32]) to be a Dict{Any, DataType}, which would require literals to be annotated with the type.
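A short plain-Julia check of the literal above: the values are types, and the type of a concrete type such as Bool or Int32 is DataType, which is why the value parameter is inferred as DataType while the mixed keys widen to Any.

```julia
# The literal from the text: mixed key types, values that are types.
types_literal = Dict(["a" => Bool, ("c", "d") => Int32])
println(typeof(types_literal))  # Dict{Any, DataType}

# Each value is a concrete type, whose own type is DataType.
println(typeof(Bool), " ", typeof(Int32))  # DataType DataType
```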

Note

A TensorKey is interpreted as if it is the set of MatrixKeys that are included in the tensor. These are expanded in an internal copy of the dictionary and will override any other specified MatrixKey.
