Matrix Formats
TanayLabUtilities.MatrixFormats
—
Module
Deal with (some of) the matrix formats. This obviously can't be comprehensive, but it should cover the matrix types we have encountered so far and hopefully falls back to reasonable defaults for more exotic matrix types.
In Julia, many array types are wrappers around "parent" arrays. The specific wrappers we deal with in most cases are NamedArray, which adds names to the rows and/or columns; PermutedDimsArray, which flips the order of the axes; Transpose and Adjoint, which likewise flip the axes (Adjoint also transforms complex values); and ReadOnlyArray, which prevents mutating the array. And then there are more transformative wrappers such as SubArray, SparseVector and SparseMatrixCSC, PyArray, etc.
This makes life difficult. Specifically, you can't rely (much) on the type system to separate code dealing with different array types. For example, not all issparse arrays derive from AbstractSparseArray (because you might have a sparse array wrapped in something). It would have been great if there were isdense and isstrided functions to match and libraries actually used them to trigger optimized code but "that would have been too easy".
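The mismatch between issparse and the type hierarchy can be seen with the standard library alone. The following is a small sketch using only SparseArrays and LinearAlgebra; the variable names are illustrative and it does not use anything from this module.

```julia
using LinearAlgebra
using SparseArrays

sparse_matrix = sparse([0 1 2; 3 4 0])
wrapped = transpose(sparse_matrix)

# The wrapper reports itself as sparse...
@assert issparse(wrapped)

# ...but it is not an `AbstractSparseArray`, so a method that dispatches
# on that type will not be selected for it.
@assert !(wrapped isa AbstractSparseArray)

# The sparse storage is only reachable through the parent array.
@assert parent(wrapped) === sparse_matrix
```

This is why code that wants the sparse fast path has to test issparse (and unwrap parents) rather than rely on dispatch.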
The code here tries to put this under some control so we can write robust code which "does the right thing", in most cases, at least when it comes to converting between formats. This means we are forced to provide alternatives to some built-in functions (for example, copying arrays). Sigh.
TanayLabUtilities.MatrixFormats.copy_array
—
Function
copy_array(array::AbstractArray; eltype::Maybe{Type} = nothing, indtype::Maybe{Type} = nothing)::AbstractArray
Create a copy of an array. This differs from Base.copy in the following:
- Copying a read-only array returns a mutable array. In contrast, both Base.copy and Base.deepcopy of a ReadOnlyArray array will return a ReadOnlyArray array, which is technically correct, but is rather pointless.
- Copying a NamedArray returns a NamedArray that shares the names (but not the data storage).
- Copying will preserve the layout of the data; for example, copying a Transpose array is still a Transpose array. In contrast, while Base.deepcopy will preserve the layout, Base.copy will silently relayout the matrix, which is both expensive and unexpected.
- Copying a sparse vector or matrix gives a sparse result. Copying anything else gives a simple dense array regardless of the original type. This is done because a deepcopy of PyArray will still share the underlying buffer, which removes the whole point of doing a copy. Sigh.
- Copying a vector of anything derived from AbstractString returns a vector of AbstractString.
- You can override the eltype of the array (and/or the indtype, if it is sparse).
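The layout point in the list above can be checked with plain Base and LinearAlgebra. This sketch only shows the built-in behavior that copy_array is contrasted with; it does not call copy_array itself.

```julia
using LinearAlgebra

base = [0 1 2; 3 4 0]
transposed = transpose(base)

# `Base.copy` materializes the transpose into a plain column-major `Matrix`,
# so the wrapper (and the original memory layout) is lost.
copied = copy(transposed)
@assert copied isa Matrix
@assert copied == transposed

# `Base.deepcopy` keeps the `Transpose` wrapper around a fresh parent array.
deep_copied = deepcopy(transposed)
@assert deep_copied isa Transpose
@assert parent(deep_copied) !== base
```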
base = [0 1 2; 3 4 0]
# Dense
@assert brief(base) == "2 x 3 x Int64 in Columns (Dense)"
@assert brief(copy_array(base)) == "2 x 3 x Int64 in Columns (Dense)"
@assert copy_array(base) == base
@assert copy_array(base) !== base
@assert copy_array(base; eltype = Int32) == base
@assert brief(copy_array(base; eltype = Int32)) == "2 x 3 x Int32 in Columns (Dense)"
# Sparse
using SparseArrays
sparse = SparseMatrixCSC(base)
@assert copy_array(sparse) == sparse
@assert copy_array(sparse) !== sparse
@assert brief(sparse) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [Int64])"
@assert brief(copy_array(sparse)) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [Int64])"
@assert copy_array(sparse; eltype = Int32) == sparse
@assert brief(copy_array(sparse; eltype = Int32)) == "2 x 3 x Int32 in Columns (Sparse 4 (67%) [Int64])"
@assert copy_array(sparse; indtype = Int8) == sparse
@assert brief(copy_array(sparse; indtype = Int8)) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [Int8])"
# ReadOnly
read_only = read_only_array(base)
@assert brief(read_only) == "2 x 3 x Int64 in Columns (ReadOnly, Dense)"
@assert brief(copy_array(read_only)) == "2 x 3 x Int64 in Columns (Dense)"
@assert copy_array(read_only) == read_only
@assert copy_array(read_only) !== base
# Named
using NamedArrays
named = NamedArray(base)
@assert brief(named) == "2 x 3 x Int64 in Columns (Named, Dense)"
@assert brief(copy_array(named)) == "2 x 3 x Int64 in Columns (Named, Dense)"
@assert copy_array(named) == named
@assert parent(copy_array(named)) !== base
# Permuted
permuted = PermutedDimsArray(base, (2, 1))
@assert brief(permuted) == "3 x 2 x Int64 in Rows (Permute, Dense)"
@assert brief(copy_array(permuted)) == "3 x 2 x Int64 in Rows (Permute, Dense)"
@assert copy_array(permuted) == permuted
@assert parent(copy_array(permuted)) !== base
unpermuted = PermutedDimsArray(base, (1, 2))
@assert brief(unpermuted) == "2 x 3 x Int64 in Columns (!Permute, Dense)"
@assert brief(copy_array(unpermuted)) == "2 x 3 x Int64 in Columns (!Permute, Dense)"
@assert copy_array(unpermuted) == unpermuted
@assert parent(copy_array(unpermuted)) !== base
# LinearAlgebra
using LinearAlgebra
transposed = transpose(base)
@assert brief(transposed) == "3 x 2 x Int64 in Rows (Transpose, Dense)"
@assert brief(copy_array(transposed)) == "3 x 2 x Int64 in Rows (Transpose, Dense)"
@assert copy_array(transposed) == transposed
@assert parent(copy_array(transposed)) !== base
adjointed = adjoint(base)
@assert brief(adjointed) == "3 x 2 x Int64 in Rows (Adjoint, Dense)"
@assert brief(copy_array(adjointed)) == "3 x 2 x Int64 in Rows (Adjoint, Dense)"
@assert copy_array(adjointed) == adjointed
@assert parent(copy_array(adjointed)) !== base
# output
# Dense
base = [0, 1, 2]
@assert brief(base) == "3 x Int64 (Dense)"
@assert brief(copy_array(base)) == "3 x Int64 (Dense)"
@assert copy_array(base) == base
@assert copy_array(base) !== base
# Sparse
using SparseArrays
sparse = SparseVector(base)
@assert brief(sparse) == "3 x Int64 (Sparse 2 (67%) [Int64])"
@assert brief(copy_array(sparse)) == "3 x Int64 (Sparse 2 (67%) [Int64])"
@assert copy_array(sparse) == sparse
@assert copy_array(sparse) !== sparse
# ReadOnly
read_only = read_only_array(base)
@assert brief(read_only) == "3 x Int64 (ReadOnly, Dense)"
@assert brief(copy_array(read_only)) == "3 x Int64 (Dense)"
@assert copy_array(read_only) == read_only
@assert copy_array(read_only) !== base
# Named
using NamedArrays
named = NamedArray(base)
@assert brief(named) == "3 x Int64 (Named, Dense)"
@assert brief(copy_array(named)) == "3 x Int64 (Named, Dense)"
@assert copy_array(named) == named
@assert parent(copy_array(named)) !== base
# LinearAlgebra
using LinearAlgebra
transposed = transpose(base)
@assert brief(transposed) == "3 x Int64 (Transpose, Dense)"
@assert brief(copy_array(transposed)) == "3 x Int64 (Transpose, Dense)"
@assert copy_array(transposed) == transposed
@assert parent(copy_array(transposed)) !== base
adjointed = adjoint(base)
@assert brief(adjointed) == "3 x Int64 (Adjoint, Dense)"
@assert brief(copy_array(adjointed)) == "3 x Int64 (Adjoint, Dense)"
@assert copy_array(adjointed) == adjointed
@assert parent(copy_array(adjointed)) !== base
# String
base = split("abc", "")
@assert brief(base) == "3 x Str (Dense)"
@assert brief(copy_array(base)) == "3 x Str (Dense)"
@assert eltype(base) != AbstractString
@assert eltype(copy_array(base)) == AbstractString
@assert copy_array(base) == base
# output
TanayLabUtilities.MatrixFormats.similar_array
—
Function
similar_array(
array::AbstractArray;
[value::Any = undef,
eltype::Maybe{Type} = nothing,
default_major_axis::Maybe{Integer} = Columns]
)::AbstractArray
end
Return an array (vector or a matrix) similar to the given one. By default the data has the same eltype as the original, and is uninitialized unless you specify a value. The returned data is always dense (Vector or Matrix).
This is different from similar in that it will preserve the layout of a matrix (for example, similar_array of a transpose will also be a transpose). Also, similar_array of a NamedArray will be another NamedArray sharing the axes with the original, and ReadOnlyArray wrappers are stripped from the result. If the array is a matrix with no clear major_axis, such as a @views slice of a matrix, then the result will have the default_major_axis.
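For comparison, here is what the built-in similar does with a transposed matrix, together with one way a layout-preserving result could be built by hand. This is a sketch with the standard library only; the wrap-the-parent approach is an assumption for illustration, not necessarily how similar_array is implemented.

```julia
using LinearAlgebra

base = rand(3, 4)
transposed = transpose(base)

# The built-in `similar` drops the wrapper and returns a column-major `Matrix`.
plain = similar(transposed)
@assert plain isa Matrix{Float64}
@assert size(plain) == (4, 3)

# Allocating a new parent and re-wrapping it keeps the row-major layout.
preserved = transpose(similar(parent(transposed)))
@assert preserved isa Transpose
@assert size(preserved) == (4, 3)
@assert parent(preserved) !== base
```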
base = rand(3, 4)
@assert brief(base) == "3 x 4 x Float64 in Columns (Dense)"
@assert similar_array(base) !== base
@assert brief(similar_array(base)) == "3 x 4 x Float64 in Columns (Dense)"
@assert brief(similar_array(base; eltype = Int32)) == "3 x 4 x Int32 in Columns (Dense)"
@assert brief(similar_array(base; value = 0.0)) == "3 x 4 x Float64 in Columns (Dense)"
@assert all(similar_array(base; value = 0.0) .== 0)
# ReadOnly
read_only = read_only_array(base)
@assert brief(read_only) == "3 x 4 x Float64 in Columns (ReadOnly, Dense)"
@assert brief(similar_array(read_only)) == "3 x 4 x Float64 in Columns (Dense)"
# Named
using NamedArrays
named = NamedArray(base)
@assert brief(named) == "3 x 4 x Float64 in Columns (Named, Dense)"
@assert similar_array(named) !== named
@assert brief(similar_array(named)) == "3 x 4 x Float64 in Columns (Named, Dense)"
# Permuted
permuted = PermutedDimsArray(base, (2, 1))
@assert brief(permuted) == "4 x 3 x Float64 in Rows (Permute, Dense)"
@assert similar_array(permuted) !== permuted
@assert brief(similar_array(permuted)) == "4 x 3 x Float64 in Rows (Permute, Dense)"
# LinearAlgebra
transposed = transpose(base)
@assert brief(transposed) == "4 x 3 x Float64 in Rows (Transpose, Dense)"
@assert similar_array(transposed) !== transposed
@assert brief(similar_array(transposed)) == "4 x 3 x Float64 in Rows (Transpose, Dense)"
adjointed = adjoint(base)
@assert brief(adjointed) == "4 x 3 x Float64 in Rows (Adjoint, Dense)"
@assert similar_array(adjointed) !== adjointed
@assert brief(similar_array(adjointed)) == "4 x 3 x Float64 in Rows (Adjoint, Dense)"
# output
TanayLabUtilities.MatrixFormats.sparse_matrix_csc
—
Function
sparse_matrix_csc(
matrix::AbstractMatrix;
eltype::Maybe{Type} = nothing,
indtype::Maybe{Type} = nothing
)::SparseMatrixCSC
sparse_matrix_csc(
colptr::AbstractVector,
rowval::AbstractVector,
nzval::AbstractVector
)::Union{ReadOnlyArray, SparseMatrixCSC}
Create a sparse column-major matrix. This differs from the simple SparseMatrixCSC in the following ways:
- The integer index type is UInt32 if possible. Only very large matrix sizes use UInt64. This greatly reduces the size of large matrices.
- If constructing the matrix from three vectors, then if any of them are ReadOnlyArray, this will return a ReadOnlyArray wrapper for the result (which will internally refer to the mutable arrays).
- If eltype is specified, this will be the element type of the result.
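The effect of the narrower index type can be seen with the plain SparseMatrixCSC constructor. This sketch uses only SparseArrays and converts an existing matrix by hand; it is meant to illustrate the saving, not to reproduce sparse_matrix_csc.

```julia
using SparseArrays

# A sparse matrix with the default 64-bit index type.
wide = SparseMatrixCSC{Int64, Int64}(sparse([0 1 2; 3 4 0]))

# The same matrix with 32-bit unsigned indices.
narrow = SparseMatrixCSC{Int64, UInt32}(wide)

@assert narrow == wide
@assert eltype(rowvals(narrow)) == UInt32

# The row indices take half the space; the same holds for the column pointers.
@assert sizeof(rowvals(narrow)) * 2 == sizeof(rowvals(wide))
```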
# Matrix
@assert brief(sparse_matrix_csc([0 1 2; 3 4 0])) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [UInt32])"
@assert brief(sparse_matrix_csc([0 1 2; 3 4 0]; eltype = Float32)) == "2 x 3 x Float32 in Columns (Sparse 4 (67%) [UInt32])"
@assert brief(sparse_matrix_csc([0 1 2; 3 4 0]; indtype = UInt8)) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [UInt8])"
# Vectors
sparse = sparse_matrix_csc([0 1 2; 3 4 0])
@assert brief(sparse_matrix_csc(2, 3, sparse.colptr, sparse.rowval, sparse.nzval)) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [UInt32])"
@assert brief(sparse_matrix_csc(2, 3, read_only_array(sparse.colptr), read_only_array(sparse.rowval), read_only_array(sparse.nzval))) ==
"2 x 3 x Int64 in Columns (ReadOnly, Sparse 4 (67%) [UInt32])";
# output
TanayLabUtilities.MatrixFormats.sparse_vector
—
Function
sparse_vector(
vector::AbstractVector;
eltype::Maybe{Type} = nothing,
indtype::Maybe{Type} = nothing,
)::SparseVector
sparse_vector(
size::Integer,
inzind::AbstractVector,
nzval::AbstractVector
)::Union{ReadOnlyArray, SparseVector}
Create a sparse vector. This differs from the simple SparseVector in the following ways:
- The integer index type is UInt32 if possible. Only very large vector sizes use UInt64. This greatly reduces the size of large vectors.
- If constructing the vector from two vectors, then if any of them are ReadOnlyArray, this will return a ReadOnlyArray wrapper for the result (which will internally refer to the mutable arrays).
- If eltype is specified, this will be the element type of the result.
# Vector
@assert brief(sparse_vector([0, 1, 2])) == "3 x Int64 (Sparse 2 (67%) [UInt32])"
@assert brief(sparse_vector([0, 1, 2]; eltype = Float32)) == "3 x Float32 (Sparse 2 (67%) [UInt32])"
# Vectors
@assert brief(sparse_vector(3, [1, 3], [1.0, 2.0])) == "3 x Float64 (Sparse 2 (67%) [Int64])"
@assert brief(sparse_vector(3, read_only_array([1, 3]), read_only_array([1.0, 2.0]))) == "3 x Float64 (ReadOnly, Sparse 2 (67%) [Int64])"
# output
TanayLabUtilities.MatrixFormats.sparse_mask_vector
—
Function
sparse_mask_vector(
size::Integer,
inzind::AbstractVector
)::Union{ReadOnlyArray, SparseVector{Bool}}
Create a sparse mask vector using only the indices of the true entries. Alas, this still needs to allocate a vector of Bool for the data.
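The allocation mentioned above comes from the SparseVector representation itself, which stores a value for every stored index. A minimal sketch with SparseArrays only follows; sparse_mask_sketch and true_indices are hypothetical names, and the read-only handling of the real function is not reproduced.

```julia
using SparseArrays

# Hypothetical helper: build a Bool mask from the indices of its true entries.
function sparse_mask_sketch(size::Integer, true_indices::AbstractVector{<:Integer})
    # `SparseVector` needs one stored value per index, hence the `Bool` vector.
    values = fill(true, length(true_indices))
    return SparseVector(size, collect(true_indices), values)
end

mask = sparse_mask_sketch(3, [1, 3])
@assert mask isa SparseVector{Bool}
@assert mask[1] && !mask[2] && mask[3]
@assert nnz(mask) == 2
```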
@assert brief(sparse_mask_vector(3, [1, 3])) == "3 x Bool (Sparse 2 (67%) [Int64])"
@assert brief(sparse_mask_vector(3, read_only_array([1, 3]))) == "3 x Bool (ReadOnly, Sparse 2 (67%) [Int64])"
# output
TanayLabUtilities.MatrixFormats.dense_mask_vector
—
Function
dense_mask_vector(
size::Integer,
inzind::AbstractVector
)::Vector{Bool}
Create a dense mask vector using only the indices of the true entries.
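In plain Julia the same result can be obtained in two lines. This is a sketch of the idea under the assumption that the indices are valid and 1-based; dense_mask_sketch and true_indices are hypothetical names.

```julia
# Hypothetical helper: start from all-false and switch on the listed entries.
function dense_mask_sketch(size::Integer, true_indices::AbstractVector{<:Integer})
    mask = zeros(Bool, size)
    mask[true_indices] .= true
    return mask
end

mask = dense_mask_sketch(4, [1, 3])
@assert mask == [true, false, true, false]
```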
println(brief(dense_mask_vector(4, [1, 3])))
# output
4 x Bool (Dense; 2 (50%) true)
TanayLabUtilities.MatrixFormats.embed_dense_matrix_in_sparse_matrix
—
Function
embed_dense_matrix_in_sparse_matrix(
matrix::AbstractMatrix{T};
rows_indices::AbstractVector{<:Integer},
n_rows::Integer,
columns_indices::AbstractVector{<:Integer},
n_columns::Integer,
)::SparseMatrixCSC{T} where {T}
Embed a dense matrix into a sparse matrix of size n_rows x n_columns. The dense matrix values are placed at the positions given by rows_indices (which must be sorted) and columns_indices (which must be sorted). All entries of the dense matrix are assumed to be non-zero.
The returned sparse matrix uses the same storage as the provided dense matrix. It just wraps it with appropriate indices to appear as a larger sparse matrix.
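One way such a wrapper can be built with the standard library is to reuse the dense data as the non-zero values and synthesize the column pointers and row indices around it. The sketch below is an assumption about the technique, not the module's code; embed_dense_sketch is a hypothetical name and it takes positional arguments for brevity.

```julia
using SparseArrays

# Hypothetical helper: view a dense matrix as a block inside a larger sparse one.
function embed_dense_sketch(dense::Matrix{T}, rows_indices, n_rows, columns_indices, n_columns) where {T}
    n_dense_rows, n_dense_columns = size(dense)
    # Each listed column stores a full dense column; all other columns are empty.
    counts = zeros(Int, n_columns)
    counts[columns_indices] .= n_dense_rows
    colptr = vcat(1, 1 .+ cumsum(counts))
    # Every stored column uses the same (sorted) row indices.
    rowval = repeat(collect(Int, rows_indices), n_dense_columns)
    # `vec` of a `Matrix` shares memory with it, so no values are copied.
    nzval = vec(dense)
    return SparseMatrixCSC(n_rows, n_columns, colptr, rowval, nzval)
end

dense = Float32[1 2 3 4; 5 6 7 8; 9 10 11 12]
embedded = embed_dense_sketch(dense, [1, 3, 5], 5, [2, 3, 4, 6], 6)
@assert Matrix(embedded[[1, 3, 5], [2, 3, 4, 6]]) == dense
@assert nnz(embedded) == 12

# Writing to the dense matrix is visible through the sparse wrapper.
dense[1, 1] = 42
@assert embedded[1, 2] == 42
```

Because the storage is shared, mutating either matrix changes the other; copy first if that is not wanted.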
using SparseArrays
dense = rand(Float32, 3, 4)
sparse = embed_dense_matrix_in_sparse_matrix(dense; rows_indices = [1, 3, 5], n_rows = 5, columns_indices = [2, 3, 4, 6], n_columns = 6)
@assert all(sparse[[1, 3, 5], [2, 3, 4, 6]] .== dense)
@assert all(sparse[[2, 4], :] .== 0)
@assert all(sparse[:, [1, 5]] .== 0)
# output
TanayLabUtilities.MatrixFormats.embed_sparse_matrix_in_sparse_matrix
—
Function
embed_sparse_matrix_in_sparse_matrix(
matrix::SparseMatrixCSC{T};
rows_indices::AbstractVector{<:Integer},
n_rows::Integer,
columns_indices::AbstractVector{<:Integer},
n_columns::Integer,
)::SparseMatrixCSC{T} where {T}
Embed a sparse matrix into a larger sparse matrix of size n_rows x n_columns. The sparse matrix non-zero values are placed at the positions given by rows_indices (which must be sorted) and columns_indices (which must be sorted), which remap the row and column indices of the input matrix.
The returned sparse matrix uses the same nzval storage as the provided sparse matrix. Only the colptr and rowval arrays are newly allocated with remapped indices.
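The remapping described above can be sketched with SparseArrays alone: spread the per-column entry counts over the larger column range, translate the row indices, and reuse the value vector. This is an illustration under stated assumptions (sorted indices, an Int index type); embed_sparse_sketch is a hypothetical name, not the module's implementation.

```julia
using SparseArrays

# Hypothetical helper: place a sparse matrix inside a larger sparse matrix.
function embed_sparse_sketch(input::SparseMatrixCSC, rows_indices, n_rows, columns_indices, n_columns)
    # Number of stored entries per input column, moved to the target columns.
    counts = zeros(Int, n_columns)
    counts[columns_indices] .= diff(input.colptr)
    colptr = vcat(1, 1 .+ cumsum(counts))
    # Sorted `rows_indices` keep the row order inside each column valid.
    rowval = Int.(rows_indices[input.rowval])
    # The value vector is reused as-is, not copied.
    return SparseMatrixCSC(n_rows, n_columns, colptr, rowval, input.nzval)
end

input = sparse([1, 2, 1], [1, 2, 3], Float32[10, 20, 30], 2, 3)
result = embed_sparse_sketch(input, [2, 4], 5, [1, 3, 5], 6)
@assert result[2, 1] == 10
@assert result[4, 3] == 20
@assert result[2, 5] == 30
@assert nonzeros(result) === nonzeros(input)
```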
using SparseArrays
input = sparse([1, 2, 1], [1, 2, 3], Float32[10, 20, 30], 2, 3)
result = embed_sparse_matrix_in_sparse_matrix(input; rows_indices = [2, 4], n_rows = 5, columns_indices = [1, 3, 5], n_columns = 6)
@assert result[2, 1] == 10
@assert result[4, 3] == 20
@assert result[2, 5] == 30
@assert nnz(result) == 3
@assert size(result) == (5, 6)
# output
TanayLabUtilities.MatrixFormats.sparsify
—
Function
sparsify(
matrix::AbstractMatrix;
copy::Bool = false,
eltype::Maybe{Type} = nothing,
indtype::Maybe{Type} = nothing
)::AbstractMatrix
sparsify(
vector::AbstractVector;
copy::Bool = false,
eltype::Maybe{Type} = nothing,
indtype::Maybe{Type} = nothing
)::AbstractVector
Return a sparse version of an array, possibly forcing a different eltype and/or indtype. If given a dense matrix, the default indtype will be indtype_for_size for the matrix. This will preserve the matrix layout (for example, sparsify of a transposed matrix will be a transposed matrix). If copy, this will create a copy even if it is already sparse and has the correct eltype and indtype.
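To see what "preserving the layout" means here, compare the built-in sparse conversion of a transposed dense matrix with converting the parent and re-wrapping it. This is a sketch with the standard library; the re-wrapping approach is an assumption used for illustration.

```julia
using LinearAlgebra
using SparseArrays

dense = [0.0 1.5 0.0; 2.5 0.0 0.0]
transposed = transpose(dense)

# The built-in conversion produces a column-major sparse matrix without a wrapper.
flattened = sparse(transposed)
@assert flattened isa SparseMatrixCSC
@assert flattened == transposed

# Converting the parent and re-wrapping keeps the row-major layout.
preserved = transpose(sparse(parent(transposed)))
@assert preserved isa Transpose
@assert issparse(parent(preserved))
@assert preserved == transposed
```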
using SparseArrays
# Dense
dense = rand(3, 4)
@assert sparsify(dense) == dense
@assert brief(dense) == "3 x 4 x Float64 in Columns (Dense)"
@assert brief(sparsify(dense)) == "3 x 4 x Float64 in Columns (Sparse 12 (100%) [UInt32])"
# Sparse
sparse = SparseMatrixCSC([0 1 2; 3 4 0])
@assert sparsify(sparse) === sparse
@assert brief(sparse) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [Int64])"
@assert sparsify(sparse; copy = true) == sparse
@assert sparsify(sparse; copy = true) !== sparse
@assert brief(sparsify(sparse)) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [Int64])"
@assert sparsify(sparse; eltype = Int8) == sparse
@assert brief(sparsify(sparse; eltype = Int8)) == "2 x 3 x Int8 in Columns (Sparse 4 (67%) [Int64])"
@assert sparsify(sparse; indtype = Int8) == sparse
@assert brief(sparsify(sparse; indtype = Int8)) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [Int8])"
# ReadOnly
read_only = read_only_array(sparse)
@assert sparsify(read_only) === read_only
@assert brief(read_only) == "2 x 3 x Int64 in Columns (ReadOnly, Sparse 4 (67%) [Int64])"
read_only = read_only_array(dense)
@assert sparsify(read_only) == read_only
@assert brief(sparsify(read_only)) == "3 x 4 x Float64 in Columns (ReadOnly, Sparse 12 (100%) [UInt32])"
# Named
using NamedArrays
named = NamedArray(sparse)
@assert sparsify(named) === named
@assert brief(named) == "2 x 3 x Int64 in Columns (Named, Sparse 4 (67%) [Int64])"
named = NamedArray(dense)
@assert sparsify(named) == named
@assert brief(sparsify(named)) == "3 x 4 x Float64 in Columns (Named, Sparse 12 (100%) [UInt32])"
# Permuted
permuted = PermutedDimsArray(sparse, (2, 1))
@assert sparsify(permuted) === permuted
@assert brief(permuted) == "3 x 2 x Int64 in Rows (Permute, Sparse 4 (67%) [Int64])"
unpermuted = PermutedDimsArray(sparse, (1, 2))
@assert sparsify(unpermuted) === unpermuted
@assert brief(unpermuted) == "2 x 3 x Int64 in Columns (!Permute, Sparse 4 (67%) [Int64])"
permuted = PermutedDimsArray(dense, (2, 1))
@assert sparsify(permuted) == permuted
@assert brief(permuted) == "4 x 3 x Float64 in Rows (Permute, Dense)"
@assert brief(sparsify(permuted)) == "4 x 3 x Float64 in Rows (Permute, Sparse 12 (100%) [UInt32])"
unpermuted = PermutedDimsArray(dense, (1, 2))
@assert sparsify(unpermuted) == unpermuted
@assert brief(unpermuted) == "3 x 4 x Float64 in Columns (!Permute, Dense)"
@assert brief(sparsify(unpermuted)) == "3 x 4 x Float64 in Columns (!Permute, Sparse 12 (100%) [UInt32])"
# LinearAlgebra
transposed = transpose(sparse)
@assert sparsify(transposed) === transposed
@assert brief(transposed) == "3 x 2 x Int64 in Rows (Transpose, Sparse 4 (67%) [Int64])"
adjointed = adjoint(sparse)
@assert sparsify(adjointed) === adjointed
@assert brief(adjointed) == "3 x 2 x Int64 in Rows (Adjoint, Sparse 4 (67%) [Int64])"
transposed = transpose(dense)
@assert sparsify(transposed) == transposed
@assert brief(transposed) == "4 x 3 x Float64 in Rows (Transpose, Dense)"
@assert brief(sparsify(transposed)) == "4 x 3 x Float64 in Rows (Transpose, Sparse 12 (100%) [UInt32])"
adjointed = adjoint(dense)
@assert sparsify(adjointed) == adjointed
@assert brief(adjointed) == "4 x 3 x Float64 in Rows (Adjoint, Dense)"
@assert brief(sparsify(adjointed)) == "4 x 3 x Float64 in Rows (Adjoint, Sparse 12 (100%) [UInt32])"
# output
using SparseArrays
# Dense
dense = rand(4)
@assert sparsify(dense) == dense
@assert brief(dense) == "4 x Float64 (Dense)"
@assert brief(sparsify(dense)) == "4 x Float64 (Sparse 4 (100%) [UInt32])"
# Sparse
sparse = SparseVector([0, 1, 2, 0])
@assert sparsify(sparse) === sparse
@assert brief(sparse) == "4 x Int64 (Sparse 2 (50%) [Int64])"
@assert sparsify(sparse; copy = true) == sparse
@assert sparsify(sparse; copy = true) !== sparse
@assert brief(sparsify(sparse)) == "4 x Int64 (Sparse 2 (50%) [Int64])"
@assert sparsify(sparse; eltype = Int8) == sparse
@assert brief(sparsify(sparse; eltype = Int8)) == "4 x Int8 (Sparse 2 (50%) [Int64])"
@assert sparsify(sparse; indtype = Int8) == sparse
@assert brief(sparsify(sparse; indtype = Int8)) == "4 x Int64 (Sparse 2 (50%) [Int8])"
# output
TanayLabUtilities.MatrixFormats.densify
—
Function
densify(matrix::AbstractMatrix; copy::Bool = false, eltype::Maybe{Type} = nothing)::AbstractMatrix
densify(vector::AbstractVector; copy::Bool = false, eltype::Maybe{Type} = nothing)::AbstractVector
Return a dense version of an array, possibly forcing a different eltype. This will preserve the matrix layout (for example, densify of a transposed matrix will be a transposed matrix). If copy, this will create a copy even if it is already dense and has the correct eltype.
using SparseArrays
# Dense
dense = rand(3, 4)
@assert densify(dense) === dense
@assert brief(dense) == "3 x 4 x Float64 in Columns (Dense)"
@assert densify(dense; copy = true) !== dense
@assert densify(dense; copy = true) == dense
@assert brief(densify(dense; copy = true)) == "3 x 4 x Float64 in Columns (Dense)"
@assert isapprox(densify(dense; eltype = Float32), dense)
@assert brief(densify(dense; eltype = Float32)) == "3 x 4 x Float32 in Columns (Dense)"
# Sparse
sparse = SparseMatrixCSC([0 1 2; 3 4 0])
@assert densify(sparse) == sparse
@assert brief(densify(sparse)) == "2 x 3 x Int64 in Columns (Dense)"
@assert brief(densify(sparse; eltype = Int8)) == "2 x 3 x Int8 in Columns (Dense)"
# ReadOnly
read_only = read_only_array(sparse)
@assert densify(read_only) == read_only
@assert brief(read_only) == "2 x 3 x Int64 in Columns (ReadOnly, Sparse 4 (67%) [Int64])"
@assert brief(densify(read_only)) == "2 x 3 x Int64 in Columns (ReadOnly, Dense)"
read_only = read_only_array(dense)
@assert densify(read_only) == dense
# Named
using NamedArrays
named = NamedArray(sparse)
@assert densify(named) == named
@assert brief(named) == "2 x 3 x Int64 in Columns (Named, Sparse 4 (67%) [Int64])"
@assert brief(densify(named)) == "2 x 3 x Int64 in Columns (Named, Dense)"
named = NamedArray(dense)
@assert densify(named) == dense
# Permuted
permuted = PermutedDimsArray(dense, (2, 1))
@assert densify(permuted) === permuted
@assert brief(permuted) == "4 x 3 x Float64 in Rows (Permute, Dense)"
unpermuted = PermutedDimsArray(dense, (1, 2))
@assert densify(unpermuted) === unpermuted
@assert brief(unpermuted) == "3 x 4 x Float64 in Columns (!Permute, Dense)"
permuted = PermutedDimsArray(sparse, (2, 1))
@assert densify(permuted) == permuted
@assert brief(permuted) == "3 x 2 x Int64 in Rows (Permute, Sparse 4 (67%) [Int64])"
@assert brief(densify(permuted)) == "3 x 2 x Int64 in Rows (Permute, Dense)"
unpermuted = PermutedDimsArray(sparse, (1, 2))
@assert densify(unpermuted) == unpermuted
@assert brief(unpermuted) == "2 x 3 x Int64 in Columns (!Permute, Sparse 4 (67%) [Int64])"
@assert brief(densify(unpermuted)) == "2 x 3 x Int64 in Columns (!Permute, Dense)"
# LinearAlgebra
transposed = transpose(dense)
@assert densify(transposed) === transposed
@assert brief(transposed) == "4 x 3 x Float64 in Rows (Transpose, Dense)"
adjointed = adjoint(dense)
@assert densify(adjointed) === adjointed
@assert brief(adjointed) == "4 x 3 x Float64 in Rows (Adjoint, Dense)"
transposed = transpose(sparse)
@assert densify(transposed) == transposed
@assert brief(transposed) == "3 x 2 x Int64 in Rows (Transpose, Sparse 4 (67%) [Int64])"
@assert brief(densify(transposed)) == "3 x 2 x Int64 in Rows (Transpose, Dense)"
adjointed = adjoint(sparse)
@assert densify(adjointed) == adjointed
@assert brief(adjointed) == "3 x 2 x Int64 in Rows (Adjoint, Sparse 4 (67%) [Int64])"
@assert brief(densify(adjointed)) == "3 x 2 x Int64 in Rows (Adjoint, Dense)"
# output
using SparseArrays
# Sparse
sparse = SparseVector([0, 1, 2, 0])
@assert densify(sparse) == sparse
@assert brief(densify(sparse)) == "4 x Int64 (Dense)"
# Dense
dense = rand(4)
@assert densify(dense) === dense
@assert brief(dense) == "4 x Float64 (Dense)"
@assert densify(dense; copy = true) !== dense
@assert densify(dense; copy = true) == dense
@assert brief(densify(dense; copy = true)) == "4 x Float64 (Dense)"
@assert isapprox(densify(dense; eltype = Float32), dense)
@assert brief(densify(dense; eltype = Float32)) == "4 x Float32 (Dense)"
# output
TanayLabUtilities.MatrixFormats.bestify
—
Function
bestify(
matrix::AbstractMatrix;
min_sparse_saving_fraction::AbstractFloat = 0.25,
copy::Bool = false,
eltype::Maybe{Type} = nothing,
)::AbstractMatrix
bestify(
vector::AbstractVector;
min_sparse_saving_fraction::AbstractFloat = 0.25,
copy::Bool = false,
eltype::Maybe{Type} = nothing,
)::AbstractVector
Return a "best" (dense or sparse) version of an array. The sparse format is chosen if it saves at least min_sparse_saving_fraction of the storage of the dense format. If copy, this will create a copy even if it is already in the best format.
If eltype is specified, this computes the savings (and creates the "best" version) using this element type. In addition, if given a sparse matrix, we consider the indtype_for_size for it, and if that saves min_sparse_saving_fraction relative to the current sparse representation, we'll create a new one using the better (smaller) indtype.
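The decision can be made concrete with a back-of-the-envelope storage estimate. The formula below is an assumption (CSC storage counted as values plus row indices plus column pointers), and the function names are hypothetical; it is not taken from the implementation, but it is consistent with a 5 x 5 Int32 diagonal matrix being stored as sparse at the default threshold and as dense at 0.5.

```julia
# Bytes used by a dense matrix.
dense_bytes(n_rows, n_columns, T) = n_rows * n_columns * sizeof(T)

# Assumed bytes used by a CSC matrix: one value and one row index per
# stored entry, plus one column pointer per column and one extra.
sparse_bytes(n_columns, n_nonzeros, T, Ti) =
    n_nonzeros * (sizeof(T) + sizeof(Ti)) + (n_columns + 1) * sizeof(Ti)

function sparse_saving_fraction(n_rows, n_columns, n_nonzeros, T, Ti)
    dense = dense_bytes(n_rows, n_columns, T)
    return (dense - sparse_bytes(n_columns, n_nonzeros, T, Ti)) / dense
end

# 5 x 5 Int32 matrix with 5 non-zeros and UInt32 indices:
# dense = 100 bytes, sparse = 5 * 8 + 6 * 4 = 64 bytes.
saving = sparse_saving_fraction(5, 5, 5, Int32, UInt32)
@assert saving ≈ 0.36
@assert saving >= 0.25  # sparse wins at the default threshold
@assert saving < 0.5    # dense wins when a 50% saving is required
```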
using LinearAlgebra
# Dense
dense = zeros(Int32, 5, 5)
view(dense, diagind(dense)) .= 1
@assert bestify(dense) == dense
@assert brief(bestify(dense)) == "5 x 5 x Int32 in Columns (Sparse 5 (20%) [UInt32])"
@assert bestify(dense; min_sparse_saving_fraction = 0.5) === dense
# Sparse
sparse = sparse_matrix_csc(dense)
@assert bestify(sparse) === sparse
@assert brief(sparse) == "5 x 5 x Int32 in Columns (Sparse 5 (20%) [UInt32])"
# ReadOnly
read_only = read_only_array(dense)
@assert bestify(read_only; min_sparse_saving_fraction = 0.5) === read_only
@assert brief(read_only) == "5 x 5 x Int32 in Columns (ReadOnly, Dense)"
@assert bestify(read_only) == read_only
@assert brief(bestify(read_only)) == "5 x 5 x Int32 in Columns (ReadOnly, Sparse 5 (20%) [UInt32])"
read_only = read_only_array(sparse)
@assert bestify(read_only) === read_only
@assert brief(read_only) == "5 x 5 x Int32 in Columns (ReadOnly, Sparse 5 (20%) [UInt32])"
@assert bestify(read_only; min_sparse_saving_fraction = 0.5) == read_only
@assert brief(bestify(read_only; min_sparse_saving_fraction = 0.5)) == "5 x 5 x Int32 in Columns (ReadOnly, Dense)"
# Named
using NamedArrays
named = NamedArray(dense)
@assert bestify(named; min_sparse_saving_fraction = 0.5) === named
@assert brief(named) == "5 x 5 x Int32 in Columns (Named, Dense)"
@assert bestify(named) == named
@assert brief(bestify(named)) == "5 x 5 x Int32 in Columns (Named, Sparse 5 (20%) [UInt32])"
named = NamedArray(sparse)
@assert bestify(named) === named
@assert brief(named) == "5 x 5 x Int32 in Columns (Named, Sparse 5 (20%) [UInt32])"
@assert bestify(named; min_sparse_saving_fraction = 0.5) == named
@assert brief(bestify(named; min_sparse_saving_fraction = 0.5)) == "5 x 5 x Int32 in Columns (Named, Dense)"
# Permuted
permuted = PermutedDimsArray(dense, (2, 1))
@assert bestify(permuted; min_sparse_saving_fraction = 0.5) === permuted
@assert brief(permuted) == "5 x 5 x Int32 in Rows (Permute, Dense)"
@assert bestify(permuted) == permuted
@assert brief(bestify(permuted)) == "5 x 5 x Int32 in Rows (Permute, Sparse 5 (20%) [UInt32])"
permuted = PermutedDimsArray(sparse, (1, 2))
@assert bestify(permuted) === permuted
@assert brief(permuted) == "5 x 5 x Int32 in Columns (!Permute, Sparse 5 (20%) [UInt32])"
@assert bestify(permuted; min_sparse_saving_fraction = 0.5) == permuted
@assert brief(bestify(permuted; min_sparse_saving_fraction = 0.5)) == "5 x 5 x Int32 in Columns (!Permute, Dense)"
# LinearAlgebra
transposed = transpose(dense)
@assert bestify(transposed; min_sparse_saving_fraction = 0.5) === transposed
@assert brief(transposed) == "5 x 5 x Int32 in Rows (Transpose, Dense)"
@assert bestify(transposed) == transposed
@assert brief(bestify(transposed)) == "5 x 5 x Int32 in Rows (Transpose, Sparse 5 (20%) [UInt32])"
adjointed = adjoint(sparse)
@assert bestify(adjointed) === adjointed
@assert brief(adjointed) == "5 x 5 x Int32 in Rows (Adjoint, Sparse 5 (20%) [UInt32])"
@assert bestify(adjointed; min_sparse_saving_fraction = 0.5) == adjointed
@assert brief(bestify(adjointed; min_sparse_saving_fraction = 0.5)) == "5 x 5 x Int32 in Rows (Adjoint, Dense)"
# output
using LinearAlgebra
# Dense
dense = zeros(Int32, 3)
dense[1] = 1
@assert bestify(dense) == dense
@assert brief(bestify(dense)) == "3 x Int32 (Sparse 1 (33%) [UInt32])"
@assert bestify(dense; min_sparse_saving_fraction = 0.5) === dense
# Sparse
sparse = sparse_vector(dense)
@assert bestify(sparse) === sparse
@assert brief(sparse) == "3 x Int32 (Sparse 1 (33%) [UInt32])"
# output
TanayLabUtilities.MatrixFormats.indtype_for_size
—
Function
indtype_for_size(size::Integer)::Type
Return the integer data type which is large enough to hold indices and offsets for a SparseMatrixCSC matrix of some size (total number of elements). We try to use UInt32 whenever possible because for large matrices (especially with 32-bit value types) this will drastically reduce the amount of space used.
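The selection rule can be sketched as a single comparison. The exact cutoff used by indtype_for_size is not stated here, so the typemax(UInt32) threshold below is an assumption and indtype_sketch is a hypothetical name.

```julia
# Hypothetical rule: use 32-bit indices whenever the size fits in them.
indtype_sketch(size::Integer) = size <= typemax(UInt32) ? UInt32 : UInt64

@assert indtype_sketch(10_000_000) == UInt32
@assert indtype_sketch(10_000_000_000) == UInt64
```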
println(10000000 => indtype_for_size(10000000))
println(10000000000 => indtype_for_size(10000000000))
# output
10000000 => UInt32
10000000000 => UInt64
TanayLabUtilities.MatrixFormats.colptr
—
Function
colptr(sparse::AbstractMatrix)::AbstractVector{<:Integer}
Return the colptr of a sparse matrix.
using NamedArrays
using SparseArrays
sparse_matrix = SparseMatrixCSC([0 1 2; 3 4 0])
@assert colptr(sparse_matrix) === sparse_matrix.colptr;
@assert colptr(read_only_array(sparse_matrix)) === sparse_matrix.colptr;
@assert colptr(NamedArray(sparse_matrix)) === sparse_matrix.colptr;
# output
TanayLabUtilities.MatrixFormats.rowval
—
Function
rowval(sparse::AbstractArray)::AbstractVector{<:Integer}
Return the rowval of a sparse array.
using NamedArrays
using SparseArrays
sparse_matrix = SparseMatrixCSC([0 1 2; 3 4 0])
@assert rowval(sparse_matrix) === sparse_matrix.rowval;
@assert rowval(read_only_array(sparse_matrix)) === sparse_matrix.rowval;
@assert rowval(NamedArray(sparse_matrix)) === sparse_matrix.rowval;
# output
TanayLabUtilities.MatrixFormats.nzind
—
Function
nzind(sparse::AbstractVector)::AbstractVector{<:Integer}
Return the nzind of a sparse vector.
using NamedArrays
using SparseArrays
sparse_vector = SparseVector([0, 1, 2])
@assert nzind(sparse_vector) === sparse_vector.nzind;
@assert nzind(read_only_array(sparse_vector)) === sparse_vector.nzind;
@assert nzind(NamedArray(sparse_vector)) === sparse_vector.nzind;
# output
TanayLabUtilities.MatrixFormats.nzval
—
Function
nzval(sparse::AbstractArray)::AbstractVector
Return the nzval of a sparse array.
using NamedArrays
using SparseArrays
sparse_matrix = SparseMatrixCSC([0 1 2; 3 4 0])
@assert nzval(sparse_matrix) === sparse_matrix.nzval;
@assert nzval(read_only_array(sparse_matrix)) === sparse_matrix.nzval;
@assert nzval(NamedArray(sparse_matrix)) === sparse_matrix.nzval;
sparse_vector = SparseVector([0, 1, 2])
@assert nzval(sparse_vector) === sparse_vector.nzval;
@assert nzval(read_only_array(sparse_vector)) === sparse_vector.nzval;
@assert nzval(NamedArray(sparse_vector)) === sparse_vector.nzval;
# output
Index
- TanayLabUtilities.MatrixFormats
- TanayLabUtilities.MatrixFormats.bestify
- TanayLabUtilities.MatrixFormats.colptr
- TanayLabUtilities.MatrixFormats.copy_array
- TanayLabUtilities.MatrixFormats.dense_mask_vector
- TanayLabUtilities.MatrixFormats.densify
- TanayLabUtilities.MatrixFormats.embed_dense_matrix_in_sparse_matrix
- TanayLabUtilities.MatrixFormats.embed_sparse_matrix_in_sparse_matrix
- TanayLabUtilities.MatrixFormats.indtype_for_size
- TanayLabUtilities.MatrixFormats.nzind
- TanayLabUtilities.MatrixFormats.nzval
- TanayLabUtilities.MatrixFormats.rowval
- TanayLabUtilities.MatrixFormats.similar_array
- TanayLabUtilities.MatrixFormats.sparse_mask_vector
- TanayLabUtilities.MatrixFormats.sparse_matrix_csc
- TanayLabUtilities.MatrixFormats.sparse_vector
- TanayLabUtilities.MatrixFormats.sparsify