Writers
DataAxesFormats.Writers
—
Module
The `DafWriter` interface specifies a high-level API for writing `Daf` data. This API is implemented here, on top of the low-level `FormatWriter` API. It extends the `DafReader` API and provides thread safety for reading and writing the same data set from multiple threads, so the low-level API can (mostly) ignore this issue.
Scalar properties
DataAxesFormats.Writers.set_scalar!
—
Function
set_scalar!(
daf::DafWriter,
name::AbstractString,
value::StorageScalar;
[overwrite::Bool = false]
)::Nothing
Set the `value` of a scalar property with some `name` in `daf`.
If not `overwrite` (the default), this first verifies the `name` scalar property does not exist.
cells = example_cells_daf()
set_scalar!(cells, "version", 1.0)
println(get_scalar(cells, "version"))
set_scalar!(cells, "version", 2.0; overwrite = true)
println(get_scalar(cells, "version"))
# output
1.0
2.0
DataAxesFormats.Writers.delete_scalar!
—
Function
delete_scalar!(
daf::DafWriter,
name::AbstractString;
must_exist::Bool = true,
)::Nothing
Delete a scalar property with some `name` from `daf`.
If `must_exist` (the default), this first verifies the `name` scalar property exists in `daf`.
cells = example_cells_daf()
println(has_scalar(cells, "organism"))
delete_scalar!(cells, "organism")
println(has_scalar(cells, "organism"))
# output
true
false
Writers axes
DataAxesFormats.Writers.add_axis!
—
Function
add_axis!(
daf::DafWriter,
axis::AbstractString,
entries::AbstractVector{<:AbstractString};
overwrite::Bool = false,
)::Nothing
Add a new `axis` to `daf`.
This verifies the `entries` are unique. If `overwrite`, this will first delete an existing axis with the same name (which will also delete any data associated with this axis!). Otherwise, this verifies the `axis` does not exist.
cells = example_cells_daf()
println(has_axis(cells, "block"))
add_axis!(cells, "block", ["B1", "B2"])
println(has_axis(cells, "block"))
# output
false
true
DataAxesFormats.Writers.delete_axis!
—
Function
delete_axis!(
daf::DafWriter,
axis::AbstractString;
must_exist::Bool = true,
)::Nothing
Delete an `axis` from `daf`. This will also delete any vector or matrix properties that are based on this axis.
If `must_exist` (the default), this first verifies the `axis` exists in `daf`.
metacells = example_metacells_daf()
println(has_axis(metacells, "type"))
delete_axis!(metacells, "type")
println(has_axis(metacells, "type"))
# output
true
false
Vector properties
DataAxesFormats.Writers.set_vector!
—
Function
set_vector!(
daf::DafWriter,
axis::AbstractString,
name::AbstractString,
vector::Union{StorageScalar, StorageVector};
[eltype::Maybe{Type{<:StorageReal}} = nothing,
overwrite::Bool = false]
)::Nothing
Set a vector property with some `name` for some `axis` in `daf`.
If the `vector` specified is actually a `StorageScalar`, the stored vector is filled with this value.
This first verifies the `axis` exists in `daf`, that the property is not called `"name"` (which is reserved for the axis entries), and that the `vector` has the appropriate length. If not `overwrite` (the default), this also verifies the `name` vector does not exist for the `axis`.
If `eltype` is specified, and the data is of another type, then the data is converted to this type before being stored.
metacells = example_metacells_daf()
println(has_vector(metacells, "type", "is_mebemp"))
set_vector!(metacells, "type", "is_mebemp", [true, true, false, false])
println(has_vector(metacells, "type", "is_mebemp"))
set_vector!(metacells, "type", "is_mebemp", [true, true, true, false]; overwrite = true)
println(has_vector(metacells, "type", "is_mebemp"))
# output
false
true
true
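The argument handling described above (scalar fill, length check, optional `eltype` conversion) can be pictured with plain Julia arrays. This is a minimal sketch using only Base; `prepare_vector` is a hypothetical helper written for illustration, not part of DataAxesFormats, and it makes no claim about how the package implements `set_vector!`.

```julia
# Hypothetical helper (not part of DataAxesFormats): mimics the documented
# argument handling of `set_vector!` on plain Julia values.
function prepare_vector(value, axis_length::Integer; eltype = nothing)
    # A scalar is expanded into a vector filled with that value.
    vector = value isa AbstractVector ? value : fill(value, axis_length)
    # The vector must have one entry per axis entry.
    length(vector) == axis_length ||
        error("vector length $(length(vector)) does not match axis length $(axis_length)")
    # Convert only if an element type was requested and differs from the data's.
    if eltype !== nothing && Base.eltype(vector) != eltype
        vector = convert(Vector{eltype}, vector)
    end
    return vector
end

filled = prepare_vector(0.5, 4)
converted = prepare_vector([1, 2, 3, 4], 4; eltype = Float32)
println(filled)
println(typeof(converted))
```

In this sketch a vector of the wrong length raises an error before anything else happens, which mirrors the order of the checks listed above.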
DataAxesFormats.Writers.delete_vector!
—
Function
delete_vector!(
daf::DafWriter,
axis::AbstractString,
name::AbstractString;
must_exist::Bool = true,
)::Nothing
Delete a vector property with some `name` for some `axis` from `daf`.
This first verifies the `axis` exists in `daf` and that the property is not called `"name"` (which is reserved for the axis entries). If `must_exist` (the default), this also verifies the `name` vector exists for the `axis`.
metacells = example_metacells_daf()
println(has_vector(metacells, "type", "color"))
delete_vector!(metacells, "type", "color")
println(has_vector(metacells, "type", "color"))
# output
true
false
Matrix properties
DataAxesFormats.Writers.set_matrix!
—
Function
set_matrix!(
daf::DafWriter,
rows_axis::AbstractString,
columns_axis::AbstractString,
name::AbstractString,
matrix::Union{StorageScalarBase, StorageMatrix};
[eltype::Maybe{Type{<:StorageScalarBase}} = nothing,
overwrite::Bool = false,
relayout::Bool = true]
)::Nothing
Set the matrix property with some `name` for some `rows_axis` and `columns_axis` in `daf`. Since this is Julia, this should be a column-major `matrix`.
If the `matrix` specified is actually a `StorageScalar`, the stored matrix is filled with this value.
If `relayout` (the default), this will also automatically `relayout!` the matrix and store the result, so the data is also stored in row-major layout (that is, with the axes flipped), similarly to calling `relayout_matrix!`.
This first verifies the `rows_axis` and `columns_axis` exist in `daf`, and that the `matrix` is column-major and of the appropriate size. If not `overwrite` (the default), this also verifies the `name` matrix does not exist for the `rows_axis` and `columns_axis`.
metacells = example_metacells_daf()
println(has_matrix(metacells, "gene", "metacell", "confidence"))
println(has_matrix(metacells, "gene", "metacell", "confidence"; relayout = false))
println(has_matrix(metacells, "metacell", "gene", "confidence"; relayout = false))
set_matrix!(metacells, "metacell", "gene", "confidence", rand(7, 683); relayout = false)
println()
println(has_matrix(metacells, "gene", "metacell", "confidence"))
println(has_matrix(metacells, "gene", "metacell", "confidence"; relayout = false))
println(has_matrix(metacells, "metacell", "gene", "confidence"; relayout = false))
set_matrix!(metacells, "metacell", "gene", "confidence", rand(7, 683); overwrite = true)
println()
println(has_matrix(metacells, "gene", "metacell", "confidence"))
println(has_matrix(metacells, "gene", "metacell", "confidence"; relayout = false))
println(has_matrix(metacells, "metacell", "gene", "confidence"; relayout = false))
# output
false
false
false
true
false
true
true
true
true
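To picture what storing both layouts means, the following sketch uses only Base Julia arrays. It assumes that a relayout amounts to materializing the transposed matrix in memory, so that the flipped-axes copy is itself column-major; it does not call DataAxesFormats or its `relayout!` function, and the small matrix is made up for illustration.

```julia
# A 2 x 3 column-major matrix: rows stand for metacells, columns for genes.
metacells_by_genes = [1.0 2.0 3.0; 4.0 5.0 6.0]

# `permutedims` allocates a new 3 x 2 matrix with the axes flipped. Unlike the
# lazy `transpose` wrapper, the result is an ordinary column-major `Matrix`.
genes_by_metacells = permutedims(metacells_by_genes)

# Both copies hold the same values, addressed with flipped indices.
println(genes_by_metacells[3, 2] == metacells_by_genes[2, 3])

# In each copy a column is contiguous in memory (stride 1 along the rows).
println(strides(metacells_by_genes))
println(strides(genes_by_metacells))
```

Keeping both copies trades storage for speed: whichever axis an algorithm iterates over, there is a copy in which that axis is the contiguous one.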
DataAxesFormats.Writers.relayout_matrix!
—
Function
relayout_matrix!(
daf::DafWriter,
rows_axis::AbstractString,
columns_axis::AbstractString,
name::AbstractString;
[overwrite::Bool = false]
)::Nothing
If a matrix property with some `name` exists (in column-major layout) in `daf` for the `rows_axis` and the `columns_axis`, then `relayout!` it and store the row-major result as well (that is, with flipped axes).
This is useful after calling `empty_dense_matrix!` or `empty_sparse_matrix!` to ensure both layouts of the matrix are stored in `daf`. When calling `set_matrix!`, it is simpler to just specify (the default) `relayout = true`.
This first verifies the `rows_axis` and `columns_axis` exist in `daf`, and that there is a `name` (column-major) matrix property for them. If not `overwrite` (the default), this also verifies the `name` matrix does not exist for the flipped `rows_axis` and `columns_axis`.
A restriction of the way `Daf` stores data is that square data is only stored in one (column-major) layout. For example, to store a weighted directed graph between cells, you may store an `outgoing_weights` matrix where each cell's column holds the outgoing weights from the cell to the other cells. In this case you can't ask `Daf` to relayout the matrix to row-major order so that each cell's row would hold the incoming weights from the other cells. Instead you would need to explicitly store a separate `incoming_weights` matrix where each cell's column holds the incoming weights.
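The directed-graph case can be illustrated with a plain Julia matrix. This sketch uses only Base; the variable names `outgoing_weights` and `incoming_weights` mirror the two matrices described above, and the three-cell graph is made up for illustration.

```julia
# Column j holds the weights of the edges going out of cell j:
# outgoing_weights[i, j] is the weight of the edge j -> i.
outgoing_weights = [
    0.0 0.2 0.0
    0.7 0.0 1.0
    0.3 0.8 0.0
]

# Both axes are "cell", so flipping the axes names the same (cell, cell) pair.
# To get fast access to incoming edges, the transposed data is kept as a
# second matrix that would be stored under its own property name.
incoming_weights = permutedims(outgoing_weights)

# Column i of incoming_weights holds the weights of the edges coming into cell i.
println(incoming_weights[:, 2] == outgoing_weights[2, :])
```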
DataAxesFormats.Writers.delete_matrix!
—
Function
delete_matrix!(
daf::DafWriter,
rows_axis::AbstractString,
columns_axis::AbstractString,
name::AbstractString;
[must_exist::Bool = true,
relayout::Bool = true]
)::Nothing
Delete a matrix property with some `name` for some `rows_axis` and `columns_axis` from `daf`.
If `relayout` (the default), this will also delete the matrix in the other layout (that is, with flipped axes).
This first verifies the `rows_axis` and `columns_axis` exist in `daf`. If `must_exist` (the default), this also verifies the `name` matrix exists for the `rows_axis` and `columns_axis`.
cells = example_cells_daf()
println(has_matrix(cells, "gene", "cell", "UMIs"))
println(has_matrix(cells, "gene", "cell", "UMIs"; relayout = false))
println(has_matrix(cells, "cell", "gene", "UMIs"; relayout = false))
delete_matrix!(cells, "gene", "cell", "UMIs"; relayout = false)
println()
println(has_matrix(cells, "gene", "cell", "UMIs"))
println(has_matrix(cells, "gene", "cell", "UMIs"; relayout = false))
println(has_matrix(cells, "cell", "gene", "UMIs"; relayout = false))
delete_matrix!(cells, "gene", "cell", "UMIs"; must_exist = false)
println()
println(has_matrix(cells, "gene", "cell", "UMIs"))
println(has_matrix(cells, "gene", "cell", "UMIs"; relayout = false))
println(has_matrix(cells, "cell", "gene", "UMIs"; relayout = false))
# output
true
true
true
true
false
true
false
false
false
Creating properties
DataAxesFormats.Writers.empty_dense_vector!
—
Function
empty_dense_vector!(
fill::Function,
daf::DafWriter,
axis::AbstractString,
name::AbstractString,
eltype::Type{<:StorageReal};
[overwrite::Bool = false]
)::Any
Create an empty dense vector property with some `name` for some `axis` in `daf`, pass it to `fill`, and return the result.
The vector passed to `fill` is uninitialized; the caller is expected to fill it with values. This saves creating a copy of the vector before setting it in the data, which makes a huge difference when creating vectors on disk (using memory mapping). For this reason, this does not work for strings, as they do not have a fixed size.
This first verifies the `axis` exists in `daf` and that the property is not called `"name"` (which is reserved for the axis entries). If not `overwrite` (the default), this also verifies the `name` vector does not exist for the `axis`.
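The benefit of filling an uninitialized vector in place can be sketched with the standard library's `Mmap` module. The helper `with_mapped_vector` below is hypothetical and is not part of DataAxesFormats; it only imitates the "create, pass to `fill`, return the result" pattern for a vector backed by a file, under the assumption that a disk format would do something comparable.

```julia
using Mmap

# Hypothetical helper (not part of DataAxesFormats): create an uninitialized
# vector backed by a file, pass it to `fill`, and return whatever `fill` returns.
function with_mapped_vector(fill::Function, path::AbstractString, ::Type{T}, len::Integer) where {T}
    open(path, "w+") do io
        # The file is grown to hold `len` elements; no in-memory copy is made.
        vector = Mmap.mmap(io, Vector{T}, (len,))
        result = fill(vector)
        # Flush the modified pages to disk before closing the file.
        Mmap.sync!(vector)
        return result
    end
end

path = joinpath(mktempdir(), "ages.bin")

# The `do` block receives the uninitialized vector and writes every entry.
total = with_mapped_vector(path, Float32, 4) do ages
    ages .= Float32[1.5, 2.5, 3.5, 4.5]
    return sum(ages)
end

# Reading the file back shows the values were written in place.
stored = read!(path, Vector{Float32}(undef, 4))
println(total)
println(stored)
```

A fixed-size element type is what lets the file size be computed up front, which is consistent with the restriction on strings stated above.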
DataAxesFormats.Writers.empty_sparse_vector!
—
Function
empty_sparse_vector!(
fill::Function,
daf::DafWriter,
axis::AbstractString,
name::AbstractString,
eltype::Type{<:StorageReal},
nnz::StorageInteger,
indtype::Maybe{Type{<:StorageInteger}} = nothing;
[overwrite::Bool = false]
)::Any
Create an empty sparse vector property with some `name` for some `axis` in `daf`, pass its parts (`nzind` and `nzval`) to `fill`, and return the result.
If `indtype` is not specified, it is chosen automatically to be the smallest unsigned integer type needed for the vector.
The vectors passed to `fill` are uninitialized; the caller is expected to fill the `nzind` and `nzval` vectors with values. Specifying the `nnz` makes their sizes known in advance, to allow pre-allocating disk data. For this reason, this does not work for strings, as they do not have a fixed size.
This severely restricts the usefulness of this function, because typically `nnz` is only known after fully computing the vector. Still, in some cases a large sparse vector is created by concatenating several smaller ones; this function allows doing so directly into the data vector, avoiding a copy in case of memory-mapped disk formats.
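It is the caller's responsibility to fill the two vectors with valid data. Specifically, assuming the layout of the standard library's `SparseVector`, you must ensure:
- `1 <= nzind[1]`
- `nzind[i] < nzind[i + 1]`
- `nzind[end]` is at most the number of entries of the `axis`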
This first verifies the `axis` exists in `daf` and that the property is not called `"name"` (which is reserved for the axis entries). If not `overwrite` (the default), this also verifies the `name` vector does not exist for the `axis`.
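The concatenation use case can be sketched with the standard library's `SparseArrays`. The helper `concatenate_parts!` is hypothetical and not part of DataAxesFormats, and the preallocated `nzind` and `nzval` vectors below are stand-ins for the ones that would be passed to `fill`. The sketch assumes the parts follow the `SparseVector` convention, where `nzind` holds strictly increasing positions within the vector.

```julia
using SparseArrays

# Hypothetical helper (not part of DataAxesFormats): copy the stored entries of
# consecutive sparse parts into preallocated `nzind` and `nzval` vectors.
function concatenate_parts!(nzind::AbstractVector, nzval::AbstractVector, parts)
    filled = 0   # Number of entries written so far.
    offset = 0   # Position of the current part within the full vector.
    for part in parts
        part_indices, part_values = findnz(part)
        slots = (filled + 1):(filled + length(part_indices))
        nzind[slots] .= part_indices .+ offset
        nzval[slots] .= part_values
        filled += length(part_indices)
        offset += length(part)
    end
    return filled
end

head = sparsevec([2, 4], [1.0, 2.0], 4)
tail = sparsevec([1, 3, 4], [3.0, 4.0, 5.0], 4)
parts = (head, tail)

# `nnz` is known before any data is copied, so both vectors can be preallocated.
total_nnz = sum(nnz, parts)
nzind = Vector{UInt8}(undef, total_nnz)
nzval = Vector{Float64}(undef, total_nnz)
concatenate_parts!(nzind, nzval, parts)

# Check the indices are strictly increasing and within the vector length
# before wrapping the two vectors in a `SparseVector`.
total_length = sum(length, parts)
@assert all(diff(Int.(nzind)) .> 0) && 1 <= nzind[1] && nzind[end] <= total_length
combined = SparseVector(total_length, nzind, nzval)
println(combined == vcat(head, tail))
```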
DataAxesFormats.Writers.empty_dense_matrix!
—
Function
empty_dense_matrix!(
fill::Function,
daf::DafWriter,
rows_axis::AbstractString,
columns_axis::AbstractString,
name::AbstractString,
eltype::Type{<:StorageReal};
[overwrite::Bool = false]
)::Any
Create an empty dense matrix property with some `name` for some `rows_axis` and `columns_axis` in `daf`, pass it to `fill`, and return the result. Since this is Julia, this will be a column-major `matrix`.
The matrix passed to `fill` is uninitialized; the caller is expected to fill it with values. This saves creating a copy of the matrix before setting it in `daf`, which makes a huge difference when creating matrices on disk (using memory mapping). For this reason, this does not work for strings, as they do not have a fixed size.
This first verifies the `rows_axis` and `columns_axis` exist in `daf`. If not `overwrite` (the default), this also verifies the `name` matrix does not exist for the `rows_axis` and `columns_axis`.
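Because the matrix handed to `fill` is column-major, writing it column by column touches memory sequentially. The sketch below uses a plain `Matrix{Float32}(undef, ...)` as a stand-in for that matrix; it does not call DataAxesFormats, and the sizes and values are made up for illustration.

```julia
# Stand-in for the uninitialized matrix that would be passed to `fill`:
# 3 rows (e.g. genes) by 4 columns (e.g. cells).
matrix = Matrix{Float32}(undef, 3, 4)

# Column-major means each column is contiguous, so iterate columns in the
# outer loop and rows in the inner loop to write memory sequentially.
for column in axes(matrix, 2)
    for row in axes(matrix, 1)
        matrix[row, column] = 10 * column + row
    end
end

# Every entry has been overwritten, so no uninitialized value remains.
println(matrix[:, 1])
println(vec(matrix) == [10 * c + r for c in 1:4 for r in 1:3])
```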
DataAxesFormats.Writers.empty_sparse_matrix!
—
Function
empty_sparse_matrix!(
fill::Function,
daf::DafWriter,
rows_axis::AbstractString,
columns_axis::AbstractString,
name::AbstractString,
eltype::Type{<:StorageReal},
nnz::StorageInteger,
indtype::Maybe{Type{<:StorageInteger}} = nothing;
[overwrite::Bool = false]
)::Any
Create an empty sparse matrix property with some `name` for some `rows_axis` and `columns_axis` in `daf`, pass its parts (`colptr`, `rowval` and `nzval`) to `fill`, and return the result.
If `indtype` is not specified, it is chosen automatically to be the smallest unsigned integer type needed for the matrix.
The vectors passed to `fill` are uninitialized; the caller is expected to fill the `colptr`, `rowval` and `nzval` vectors with values. Specifying the `nnz` makes their sizes known in advance, to allow pre-allocating disk space. For this reason, this does not work for strings, as they do not have a fixed size.
This severely restricts the usefulness of this function, because typically `nnz` is only known after fully computing the matrix. Still, in some cases a large sparse matrix is created by concatenating several smaller ones; this function allows doing so directly into the data, avoiding a copy in case of memory-mapped disk formats.
It is the caller's responsibility to fill the three vectors with valid data. Specifically, you must ensure:
- `colptr[1] == 1`
- `colptr[end] == nnz + 1`
- `colptr[i] <= colptr[i + 1]`
- for all `j`, for all `i` such that `colptr[j] <= i` and `i + 1 < colptr[j + 1]`, `1 <= rowval[i] < rowval[i + 1] <= nrows`
This first verifies the `rows_axis` and `columns_axis` exist in `daf`. If not `overwrite` (the default), this also verifies the `name` matrix does not exist for the `rows_axis` and `columns_axis`.
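The invariants listed above can be checked mechanically before the data is used. The following sketch relies only on the standard library's `SparseArrays`; `is_valid_csc` is a hypothetical helper, not part of DataAxesFormats, and the small vectors stand in for the ones that would be filled inside `fill`.

```julia
using SparseArrays

# Hypothetical helper (not part of DataAxesFormats): check the invariants of
# the three vectors of a compressed sparse column matrix with `nrows` rows.
function is_valid_csc(nrows::Integer, colptr::AbstractVector, rowval::AbstractVector, nzval::AbstractVector)
    stored = length(nzval)
    length(rowval) == stored || return false
    (colptr[1] == 1 && colptr[end] == stored + 1) || return false
    for column in 1:(length(colptr) - 1)
        first_entry = Int(colptr[column])
        last_entry = Int(colptr[column + 1]) - 1
        # `colptr` must be non-decreasing and stay within the stored entries.
        (first_entry <= last_entry + 1 && last_entry <= stored) || return false
        for entry in first_entry:last_entry
            1 <= rowval[entry] <= nrows || return false
            # Row indices must be strictly increasing within each column.
            if entry < last_entry && rowval[entry] >= rowval[entry + 1]
                return false
            end
        end
    end
    return true
end

# A 3 x 4 matrix with 5 stored entries; the second column is empty.
colptr = UInt8[1, 3, 3, 4, 6]
rowval = UInt8[1, 3, 2, 1, 2]
nzval = Float32[1, 2, 3, 4, 5]

println(is_valid_csc(3, colptr, rowval, nzval))
matrix = SparseMatrixCSC(3, 4, colptr, rowval, nzval)
println(Matrix(matrix))
```

Running such a check at the end of the `fill` function is cheap compared to computing the data, and it catches mistakes such as unsorted row indices before they reach storage.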
Index
- DataAxesFormats.Writers
- DataAxesFormats.Writers.add_axis!
- DataAxesFormats.Writers.delete_axis!
- DataAxesFormats.Writers.delete_matrix!
- DataAxesFormats.Writers.delete_scalar!
- DataAxesFormats.Writers.delete_vector!
- DataAxesFormats.Writers.empty_dense_matrix!
- DataAxesFormats.Writers.empty_dense_vector!
- DataAxesFormats.Writers.empty_sparse_matrix!
- DataAxesFormats.Writers.empty_sparse_vector!
- DataAxesFormats.Writers.relayout_matrix!
- DataAxesFormats.Writers.set_matrix!
- DataAxesFormats.Writers.set_scalar!
- DataAxesFormats.Writers.set_vector!