Matrix Formats

TanayLabUtilities.MatrixFormats Module

Deal with (some of) the matrix formats. This obviously can't be comprehensive, but it should cover the matrix types we have encountered so far and hopefully falls back to reasonable defaults for more exotic matrix types.

In Julia, many array types are wrappers around "parent" arrays. The specific wrappers we deal with in most cases are NamedArray, which adds names to the rows and/or columns; PermutedDimsArray, which flips the order of the axes; Transpose and Adjoint, which likewise flip the axes (Adjoint also transforms complex values); and ReadOnlyArray, which prevents mutating the array. And then there are more transformative wrappers such as SubArray, SparseVector and SparseMatrixCSC, PyArray, etc.

This makes life difficult. Specifically, you can't rely (much) on the type system to separate code dealing with different array types. For example, not all issparse arrays derive from AbstractSparseArray (because you might have a sparse array wrapped in something). It would have been great if there were isdense and isstrided functions to match, and if libraries actually used them to trigger optimized code, but "that would have been too easy".
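
The wrapping problem can be seen using only the standard libraries. The following is a minimal sketch, not the package's implementation; `is_sparse_inside` is a hypothetical helper introduced here for illustration, and it assumes that walking `parent` until it returns its argument is enough to reach the underlying storage.

```julia
using LinearAlgebra
using SparseArrays

# Hypothetical helper (not part of the package): peel wrappers off via `parent`
# until we find a sparse array or run out of wrappers.
function is_sparse_inside(array::AbstractArray)::Bool
    while true
        array isa AbstractSparseArray && return true
        inner = parent(array)
        inner === array && return false  # `parent` of an unwrapped array is itself
        array = inner
    end
end

stored = sparse([0 1 2; 3 4 0])
flipped = PermutedDimsArray(stored, (2, 1))

# The wrappers hide the sparse type from dispatch...
@assert stored isa AbstractSparseArray
@assert !(transpose(stored) isa AbstractSparseArray)
@assert !(flipped isa AbstractSparseArray)

# ...even though `issparse` knows about this particular wrapper.
@assert issparse(transpose(stored))

# Unwrapping by hand finds the sparse storage in both cases.
@assert is_sparse_inside(transpose(stored))
@assert is_sparse_inside(flipped)
@assert !is_sparse_inside([0 1 2; 3 4 0])
```

Dispatching on `AbstractSparseArray` alone would send both wrapped arrays down the generic (dense) code path.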

The code here tries to put this under some control so we can write robust code which "does the right thing", in most cases, at least when it comes to converting between formats. This means we are forced to provide alternatives to some built-in functions (for example, copying arrays). Sigh.

TanayLabUtilities.MatrixFormats.copy_array Function
copy_array(array::AbstractArray; eltype::Maybe{Type} = nothing, indtype::Maybe{Type} = nothing)::AbstractArray

Create a copy of an array. This differs from Base.copy in the following:

  • Copying a read-only array returns a mutable array. In contrast, both Base.copy and Base.deepcopy of a ReadOnlyArray will return a ReadOnlyArray, which is technically correct but rather pointless.

  • Copying a NamedArray returns a NamedArray that shares the names (but not the data storage).

  • Copying will preserve the layout of the data; for example, a copy of a Transpose array is still a Transpose array. In contrast, while Base.deepcopy will preserve the layout, Base.copy will silently relayout the matrix, which is both expensive and unexpected.

  • Copying a sparse vector or matrix gives a sparse result. Copying anything else gives a simple dense array regardless of the original type. This is done because a deepcopy of PyArray will still share the underlying buffer, which removes the whole point of doing a copy. Sigh.

  • Copying a vector of anything derived from AbstractString returns a vector of AbstractString.

  • You can override the eltype of the array (and/or the indtype, if it is sparse).
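
The Base behaviors that the list above contrasts against can be checked with the standard library alone. This is a small sketch under that assumption; the "copy the parent and re-wrap" step is one plausible way to keep the layout and is not claimed to be how copy_array does it.

```julia
using LinearAlgebra

base = [0 1 2; 3 4 0]
transposed = transpose(base)

# `Base.copy` materializes the transpose into a plain column-major matrix...
@assert copy(transposed) isa Matrix{Int}

# ...while `deepcopy` keeps the wrapper around a fresh parent.
@assert deepcopy(transposed) isa Transpose
@assert parent(deepcopy(transposed)) !== base

# A layout-preserving copy can be sketched by copying the parent and re-wrapping.
preserved = transpose(copy(parent(transposed)))
@assert preserved isa Transpose
@assert preserved == transposed
@assert parent(preserved) !== base

# Widening a vector of concrete string types to `AbstractString`.
letters = split("abc", "")
@assert eltype(letters) == SubString{String}
@assert eltype(Vector{AbstractString}(letters)) == AbstractString
```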

base = [0 1 2; 3 4 0]

# Dense

@assert brief(base) == "2 x 3 x Int64 in Columns (Dense)"
@assert brief(copy_array(base)) == "2 x 3 x Int64 in Columns (Dense)"
@assert copy_array(base) == base
@assert copy_array(base) !== base

@assert copy_array(base; eltype = Int32) == base
@assert brief(copy_array(base; eltype = Int32)) == "2 x 3 x Int32 in Columns (Dense)"

# Sparse

using SparseArrays

sparse = SparseMatrixCSC(base)
@assert copy_array(sparse) == sparse
@assert copy_array(sparse) !== sparse
@assert brief(sparse) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [Int64])"
@assert brief(copy_array(sparse)) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [Int64])"

@assert copy_array(sparse; eltype = Int32) == sparse
@assert brief(copy_array(sparse; eltype = Int32)) == "2 x 3 x Int32 in Columns (Sparse 4 (67%) [Int64])"

@assert copy_array(sparse; indtype = Int8) == sparse
@assert brief(copy_array(sparse; indtype = Int8)) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [Int8])"

# ReadOnly

read_only = read_only_array(base)
@assert brief(read_only) == "2 x 3 x Int64 in Columns (ReadOnly, Dense)"
@assert brief(copy_array(read_only)) == "2 x 3 x Int64 in Columns (Dense)"
@assert copy_array(read_only) == read_only
@assert copy_array(read_only) !== base

# Named

using NamedArrays

named = NamedArray(base)
@assert brief(named) == "2 x 3 x Int64 in Columns (Named, Dense)"
@assert brief(copy_array(named)) == "2 x 3 x Int64 in Columns (Named, Dense)"
@assert copy_array(named) == named
@assert parent(copy_array(named)) !== base

# Permuted

permuted = PermutedDimsArray(base, (2, 1))
@assert brief(permuted) == "3 x 2 x Int64 in Rows (Permute, Dense)"
@assert brief(copy_array(permuted)) == "3 x 2 x Int64 in Rows (Permute, Dense)"
@assert copy_array(permuted) == permuted
@assert parent(copy_array(permuted)) !== base

unpermuted = PermutedDimsArray(base, (1, 2))
@assert brief(unpermuted) == "2 x 3 x Int64 in Columns (!Permute, Dense)"
@assert brief(copy_array(unpermuted)) == "2 x 3 x Int64 in Columns (!Permute, Dense)"
@assert copy_array(unpermuted) == unpermuted
@assert parent(copy_array(unpermuted)) !== base

# LinearAlgebra

using LinearAlgebra

transposed = transpose(base)
@assert brief(transposed) == "3 x 2 x Int64 in Rows (Transpose, Dense)"
@assert brief(copy_array(transposed)) == "3 x 2 x Int64 in Rows (Transpose, Dense)"
@assert copy_array(transposed) == transposed
@assert parent(copy_array(transposed)) !== base

adjointed = adjoint(base)
@assert brief(adjointed) == "3 x 2 x Int64 in Rows (Adjoint, Dense)"
@assert brief(copy_array(adjointed)) == "3 x 2 x Int64 in Rows (Adjoint, Dense)"
@assert copy_array(adjointed) == adjointed
@assert parent(copy_array(adjointed)) !== base

# output


# Dense

base = [0, 1, 2]

@assert brief(base) == "3 x Int64 (Dense)"
@assert brief(copy_array(base)) == "3 x Int64 (Dense)"
@assert copy_array(base) == base
@assert copy_array(base) !== base

# Sparse

using SparseArrays

sparse = SparseVector(base)
@assert brief(sparse) == "3 x Int64 (Sparse 2 (67%) [Int64])"
@assert brief(copy_array(sparse)) == "3 x Int64 (Sparse 2 (67%) [Int64])"
@assert copy_array(sparse) == sparse
@assert copy_array(sparse) !== sparse

# ReadOnly

read_only = read_only_array(base)
@assert brief(read_only) == "3 x Int64 (ReadOnly, Dense)"
@assert brief(copy_array(read_only)) == "3 x Int64 (Dense)"
@assert copy_array(read_only) == read_only
@assert copy_array(read_only) !== base

# Named

using NamedArrays

named = NamedArray(base)
@assert brief(named) == "3 x Int64 (Named, Dense)"
@assert brief(copy_array(named)) == "3 x Int64 (Named, Dense)"
@assert copy_array(named) == named
@assert parent(copy_array(named)) !== base

# LinearAlgebra

using LinearAlgebra

transposed = transpose(base)
@assert brief(transposed) == "3 x Int64 (Transpose, Dense)"
@assert brief(copy_array(transposed)) == "3 x Int64 (Transpose, Dense)"
@assert copy_array(transposed) == transposed
@assert parent(copy_array(transposed)) !== base

adjointed = adjoint(base)
@assert brief(adjointed) == "3 x Int64 (Adjoint, Dense)"
@assert brief(copy_array(adjointed)) == "3 x Int64 (Adjoint, Dense)"
@assert copy_array(adjointed) == adjointed
@assert parent(copy_array(adjointed)) !== base

# String

base = split("abc", "")

@assert brief(base) == "3 x Str (Dense)"
@assert brief(copy_array(base)) == "3 x Str (Dense)"
@assert eltype(base) != AbstractString
@assert eltype(copy_array(base)) == AbstractString
@assert copy_array(base) == base

# output


TanayLabUtilities.MatrixFormats.similar_array Function
similar_array(
    array::AbstractArray;
    [value::Any = undef,
    eltype::Maybe{Type} = nothing,
    default_major_axis::Maybe{Integer} = Columns]
)::AbstractArray

Return an array (vector or a matrix) similar to the given one. By default the data has the same eltype as the original, and is uninitialized unless you specify a value. The returned data is always dense (Vector or Matrix).

This is different from similar in that it will preserve the layout of a matrix (for example, similar_array of a transpose will also be a transpose). Also, similar_array of a NamedArray will be another NamedArray sharing the axes with the original, and ReadOnlyArray wrappers are stripped from the result. If the array is a matrix with no clear major_axis, such as a @views slice of a matrix, then the result will have the default_major_axis.
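
For comparison, here is what Base.similar does with a transposed matrix, using only LinearAlgebra. The re-wrapping line is a sketch of one way to keep the layout and is an assumption, not the package's code.

```julia
using LinearAlgebra

base = rand(3, 4)
transposed = transpose(base)

# `Base.similar` drops the wrapper and returns a plain column-major matrix.
plain = similar(transposed)
@assert plain isa Matrix{Float64}
@assert size(plain) == (4, 3)

# Sketch: allocate storage shaped like the parent, then re-wrap it, so the
# result has the same (row-major) layout as the original.
kept = transpose(similar(parent(transposed)))
@assert kept isa Transpose
@assert size(kept) == (4, 3)

# The new storage is independent of the original and can be initialized freely.
fill!(parent(kept), 0.0)
@assert all(kept .== 0)
```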

base = rand(3, 4)

@assert brief(base) == "3 x 4 x Float64 in Columns (Dense)"
@assert similar_array(base) !== base
@assert brief(similar_array(base)) == "3 x 4 x Float64 in Columns (Dense)"

@assert brief(similar_array(base; eltype = Int32)) == "3 x 4 x Int32 in Columns (Dense)"
@assert brief(similar_array(base; value = 0.0)) == "3 x 4 x Float64 in Columns (Dense)"
@assert all(similar_array(base; value = 0.0) .== 0)

# ReadOnly

read_only = read_only_array(base)
@assert brief(read_only) == "3 x 4 x Float64 in Columns (ReadOnly, Dense)"
@assert brief(similar_array(read_only)) == "3 x 4 x Float64 in Columns (Dense)"

# Named

using NamedArrays

named = NamedArray(base)
@assert brief(named) == "3 x 4 x Float64 in Columns (Named, Dense)"
@assert similar_array(named) !== named
@assert brief(similar_array(named)) == "3 x 4 x Float64 in Columns (Named, Dense)"

# Permuted

permuted = PermutedDimsArray(base, (2, 1))
@assert brief(permuted) == "4 x 3 x Float64 in Rows (Permute, Dense)"
@assert similar_array(permuted) !== permuted
@assert brief(similar_array(permuted)) == "4 x 3 x Float64 in Rows (Permute, Dense)"

# LinearAlgebra

transposed = transpose(base)
@assert brief(transposed) == "4 x 3 x Float64 in Rows (Transpose, Dense)"
@assert similar_array(transposed) !== transposed
@assert brief(similar_array(transposed)) == "4 x 3 x Float64 in Rows (Transpose, Dense)"

adjointed = adjoint(base)
@assert brief(adjointed) == "4 x 3 x Float64 in Rows (Adjoint, Dense)"
@assert similar_array(adjointed) !== adjointed
@assert brief(similar_array(adjointed)) == "4 x 3 x Float64 in Rows (Adjoint, Dense)"

# output


TanayLabUtilities.MatrixFormats.sparse_matrix_csc Function
sparse_matrix_csc(
    matrix::AbstractMatrix;
    eltype::Maybe{Type} = nothing,
    indtype::Maybe{Type} = nothing
)::SparseMatrixCSC

sparse_matrix_csc(
    n_rows::Integer,
    n_columns::Integer,
    colptr::AbstractVector,
    rowval::AbstractVector,
    nzval::AbstractVector
)::Union{ReadOnlyArray, SparseMatrixCSC}

Create a sparse column-major matrix. This differs from the simple SparseMatrixCSC in the following ways:

  • The integer index type is UInt32 if possible. Only very large matrix sizes use UInt64. This greatly reduces the size of large matrices.

  • If constructing the matrix from three vectors, then if any of them are ReadOnlyArray, this will return a ReadOnlyArray wrapper for the result (which will internally refer to the mutable arrays).

  • If eltype is specified, this will be the element type of the result.
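
To make the index-type point concrete, the following sketch uses only the standard SparseArrays constructors (not sparse_matrix_csc itself) to compare the index storage of the same matrix with 64-bit and 32-bit indices.

```julia
using SparseArrays

# The same matrix stored with 64-bit and with 32-bit indices.
wide = SparseMatrixCSC{Int64, Int64}(sparse([0 1 2; 3 4 0]))
narrow = SparseMatrixCSC{Int64, UInt32}(wide)

@assert narrow == wide
@assert eltype(rowvals(narrow)) == UInt32

# The row indices take half the bytes; the column offsets shrink the same way.
@assert 2 * sizeof(rowvals(narrow)) == sizeof(rowvals(wide))
```

With 32-bit values (for example Float32), halving the index size removes roughly a quarter of the total storage of the non-zero entries.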

# Matrix

@assert brief(sparse_matrix_csc([0 1 2; 3 4 0])) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [UInt32])"
@assert brief(sparse_matrix_csc([0 1 2; 3 4 0]; eltype = Float32)) == "2 x 3 x Float32 in Columns (Sparse 4 (67%) [UInt32])"
@assert brief(sparse_matrix_csc([0 1 2; 3 4 0]; indtype = UInt8)) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [UInt8])"

# Vectors

sparse = sparse_matrix_csc([0 1 2; 3 4 0])

@assert brief(sparse_matrix_csc(2, 3, sparse.colptr, sparse.rowval, sparse.nzval)) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [UInt32])"
@assert brief(sparse_matrix_csc(2, 3, read_only_array(sparse.colptr), read_only_array(sparse.rowval), read_only_array(sparse.nzval))) ==
      "2 x 3 x Int64 in Columns (ReadOnly, Sparse 4 (67%) [UInt32])";

# output


TanayLabUtilities.MatrixFormats.sparse_vector Function
sparse_vector(
    vector::AbstractVector;
    eltype::Maybe{Type} = nothing,
    indtype::Maybe{Type} = nothing,
)::SparseVector

sparse_vector(
    size::Integer,
    inzind::AbstractVector,
    nzval::AbstractVector
)::Union{ReadOnlyArray, SparseVector}

Create a sparse vector. This differs from the simple SparseVector in the following ways:

  • The integer index type is UInt32 if possible. Only very large vector sizes use UInt64. This greatly reduces the size of large vectors.

  • If constructing the vector from two vectors, then if any of them are ReadOnlyArray, this will return a ReadOnlyArray wrapper for the result (which will internally refer to the mutable arrays).

  • If eltype is specified, this will be the element type of the result.

# Vector

@assert brief(sparse_vector([0, 1, 2])) == "3 x Int64 (Sparse 2 (67%) [UInt32])"
@assert brief(sparse_vector([0, 1, 2]; eltype = Float32)) == "3 x Float32 (Sparse 2 (67%) [UInt32])"

# Vectors

@assert brief(sparse_vector(3, [1, 3], [1.0, 2.0])) == "3 x Float64 (Sparse 2 (67%) [Int64])"
@assert brief(sparse_vector(3, read_only_array([1, 3]), read_only_array([1.0, 2.0]))) == "3 x Float64 (ReadOnly, Sparse 2 (67%) [Int64])"

# output


TanayLabUtilities.MatrixFormats.sparse_mask_vector Function
sparse_mask_vector(
    size::Integer,
    inzind::AbstractVector
)::Union{ReadOnlyArray, SparseVector{Bool}}

Create a sparse mask vector using only the indices of the true entries. Alas, this still needs to allocate a vector of Bool for the data.
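
The allocation mentioned above is visible when building such a mask by hand with the standard SparseVector constructor. This is an illustrative sketch, not the package's implementation.

```julia
using SparseArrays

true_indices = [1, 3]

# The values vector holds one `true` per index; it carries no information
# beyond its length, but the sparse format still requires it.
mask = SparseVector(3, true_indices, fill(true, length(true_indices)))

@assert mask isa SparseVector{Bool}
@assert Vector(mask) == [true, false, true]
@assert nnz(mask) == 2
```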

@assert brief(sparse_mask_vector(3, [1, 3])) == "3 x Bool (Sparse 2 (67%) [Int64])"
@assert brief(sparse_mask_vector(3, read_only_array([1, 3]))) == "3 x Bool (ReadOnly, Sparse 2 (67%) [Int64])"

# output


TanayLabUtilities.MatrixFormats.dense_mask_vector Function
dense_mask_vector(
    size::Integer,
    inzind::AbstractVector
)::Vector{Bool}

Create a dense mask vector using only the indices of the true entries.

println(brief(dense_mask_vector(4, [1, 3])))

# output

4 x Bool (Dense; 2 (50%) true)

TanayLabUtilities.MatrixFormats.embed_dense_matrix_in_sparse_matrix Function
embed_dense_matrix_in_sparse_matrix(
    matrix::AbstractMatrix{T};
    rows_indices::AbstractVector{<:Integer},
    n_rows::Integer,
    columns_indices::AbstractVector{<:Integer},
    n_columns::Integer,
)::SparseMatrixCSC{T} where {T}

Embed a dense matrix into a sparse matrix of size n_rows x n_columns. The dense matrix values are placed at the positions given by rows_indices (which must be sorted) and columns_indices (which must be sorted). All entries of the dense matrix are assumed to be non-zero.

Note

The returned sparse matrix uses the same storage as the provided dense matrix. It just wraps it with appropriate indices to appear as a larger sparse matrix.
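
One way such an embedding could work is sketched below with the standard SparseMatrixCSC constructor. `embed_dense_sketch` is a hypothetical name, the function is restricted to a plain Matrix so that `vec` shares its memory, and nothing here is claimed to match the package's actual code.

```julia
using SparseArrays

# Hypothetical sketch: wrap the memory of `dense` as the `nzval` of a larger
# sparse matrix, building only the index vectors.
function embed_dense_sketch(
    dense::Matrix{T};
    rows_indices::AbstractVector{<:Integer},
    n_rows::Integer,
    columns_indices::AbstractVector{<:Integer},
    n_columns::Integer,
) where {T}
    n_dense_rows, n_dense_columns = size(dense)
    @assert length(rows_indices) == n_dense_rows
    @assert length(columns_indices) == n_dense_columns
    @assert issorted(rows_indices) && issorted(columns_indices)

    # Target columns listed in `columns_indices` store one entry per dense row;
    # all other columns are empty.
    counts = zeros(Int, n_columns)
    counts[columns_indices] .= n_dense_rows
    colptr = [1; 1 .+ cumsum(counts)]

    # Every stored column uses the same (sorted) row indices.
    rowval = repeat(collect(Int, rows_indices), n_dense_columns)

    # `vec` of a `Matrix` is a vector over the same memory, in column-major order.
    return SparseMatrixCSC(n_rows, n_columns, colptr, rowval, vec(dense))
end

dense = rand(Float32, 3, 4)
embedded = embed_dense_sketch(dense; rows_indices = [1, 3, 5], n_rows = 5, columns_indices = [2, 3, 4, 6], n_columns = 6)

@assert all(embedded[[1, 3, 5], [2, 3, 4, 6]] .== dense)
@assert all(embedded[[2, 4], :] .== 0)

# Because the storage is shared, writing to the dense matrix shows up in the sparse one.
dense[1, 1] = 42
@assert embedded[1, 2] == 42
```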

using SparseArrays

dense = rand(Float32, 3, 4)
sparse = embed_dense_matrix_in_sparse_matrix(dense; rows_indices = [1, 3, 5], n_rows = 5, columns_indices = [2, 3, 4, 6], n_columns = 6)

@assert all(sparse[[1, 3, 5], [2, 3, 4, 6]] .== dense)
@assert all(sparse[[2, 4], :] .== 0)
@assert all(sparse[:, [1, 5]] .== 0)

# output


TanayLabUtilities.MatrixFormats.embed_sparse_matrix_in_sparse_matrix Function
embed_sparse_matrix_in_sparse_matrix(
    matrix::SparseMatrixCSC{T};
    rows_indices::AbstractVector{<:Integer},
    n_rows::Integer,
    columns_indices::AbstractVector{<:Integer},
    n_columns::Integer,
)::SparseMatrixCSC{T} where {T}

Embed a sparse matrix into a larger sparse matrix of size n_rows x n_columns. The sparse matrix non-zero values are placed at the positions given by rows_indices (which must be sorted) and columns_indices (which must be sorted), which remap the row and column indices of the input matrix.

Note

The returned sparse matrix uses the same nzval storage as the provided sparse matrix. Only the colptr and rowval arrays are newly allocated with remapped indices.
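
The remapping can be sketched with standard SparseArrays calls as follows. `embed_sparse_sketch` is a hypothetical name, and the sketch assumes the input has no spare capacity in its index and value vectors (true for matrices built by `sparse`); it is not the package's implementation.

```julia
using SparseArrays

# Hypothetical sketch: reuse the input's `nzval`, allocating only new indices.
function embed_sparse_sketch(
    input::SparseMatrixCSC;
    rows_indices::AbstractVector{<:Integer},
    n_rows::Integer,
    columns_indices::AbstractVector{<:Integer},
    n_columns::Integer,
)
    @assert length(rows_indices) == size(input, 1)
    @assert length(columns_indices) == size(input, 2)
    @assert issorted(rows_indices) && issorted(columns_indices)

    # Each input column keeps its number of stored entries at its new position.
    counts = zeros(Int, n_columns)
    for (input_column, column) in enumerate(columns_indices)
        counts[column] = length(nzrange(input, input_column))
    end
    colptr = [1; 1 .+ cumsum(counts)]

    # Sorted `rows_indices` keep the remapped rows sorted within each column.
    rowval = Int[rows_indices[row] for row in rowvals(input)]

    return SparseMatrixCSC(n_rows, n_columns, colptr, rowval, nonzeros(input))
end

input = sparse([1, 2, 1], [1, 2, 3], Float32[10, 20, 30], 2, 3)
result = embed_sparse_sketch(input; rows_indices = [2, 4], n_rows = 5, columns_indices = [1, 3, 5], n_columns = 6)

@assert result[2, 1] == 10
@assert result[4, 3] == 20
@assert result[2, 5] == 30
@assert nnz(result) == 3

# The values are shared, so a change to the input is visible in the result.
nonzeros(input)[1] = 99
@assert result[2, 1] == 99
```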

using SparseArrays

input = sparse([1, 2, 1], [1, 2, 3], Float32[10, 20, 30], 2, 3)
result = embed_sparse_matrix_in_sparse_matrix(input; rows_indices = [2, 4], n_rows = 5, columns_indices = [1, 3, 5], n_columns = 6)

@assert result[2, 1] == 10
@assert result[4, 3] == 20
@assert result[2, 5] == 30
@assert nnz(result) == 3
@assert size(result) == (5, 6)

# output


TanayLabUtilities.MatrixFormats.sparsify Function
sparsify(
    matrix::AbstractMatrix;
    copy::Bool = false,
    eltype::Maybe{Type} = nothing,
    indtype::Maybe{Type} = nothing
)::AbstractMatrix

sparsify(
    vector::AbstractVector;
    copy::Bool = false,
    eltype::Maybe{Type} = nothing,
    indtype::Maybe{Type} = nothing
)::AbstractVector

Return a sparse version of an array, possibly forcing a different eltype and/or indtype. If given a dense matrix, the default indtype will be indtype_for_size for the matrix. This will preserve the matrix layout (for example, sparsify of a transposed matrix will be a transposed matrix). If copy, this will create a copy even if it is already sparse and has the correct eltype and indtype.
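
Preserving the layout while changing the storage amounts to converting the innermost parent and putting the wrappers back. The sketch below shows this idea for Transpose and Adjoint only; `rewrap` is a hypothetical helper, not a function of the package, and the same pattern applies in the opposite (densifying) direction.

```julia
using LinearAlgebra
using SparseArrays

# Hypothetical helper: apply `convert_storage` to the innermost parent and
# restore the `Transpose` / `Adjoint` wrappers around the result.
rewrap(convert_storage, array::Transpose) = transpose(rewrap(convert_storage, parent(array)))
rewrap(convert_storage, array::Adjoint) = adjoint(rewrap(convert_storage, parent(array)))
rewrap(convert_storage, array::AbstractArray) = convert_storage(array)

dense = [0.0 1.0 2.0; 3.0 4.0 0.0]

# Dense to sparse, keeping the row-major layout.
sparsified = rewrap(sparse, transpose(dense))
@assert sparsified isa Transpose
@assert parent(sparsified) isa SparseMatrixCSC
@assert sparsified == transpose(dense)

# Sparse to dense, again keeping the wrapper.
densified = rewrap(Matrix, adjoint(sparse(dense)))
@assert densified isa Adjoint
@assert parent(densified) isa Matrix
@assert densified == adjoint(dense)
```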

using SparseArrays

# Dense

dense = rand(3, 4)
@assert sparsify(dense) == dense
@assert brief(dense) == "3 x 4 x Float64 in Columns (Dense)"
@assert brief(sparsify(dense)) == "3 x 4 x Float64 in Columns (Sparse 12 (100%) [UInt32])"

# Sparse

sparse = SparseMatrixCSC([0 1 2; 3 4 0])
@assert sparsify(sparse) === sparse
@assert brief(sparse) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [Int64])"

@assert sparsify(sparse; copy = true) == sparse
@assert sparsify(sparse; copy = true) !== sparse
@assert brief(sparsify(sparse)) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [Int64])"

@assert sparsify(sparse; eltype = Int8) == sparse
@assert brief(sparsify(sparse; eltype = Int8)) == "2 x 3 x Int8 in Columns (Sparse 4 (67%) [Int64])"

@assert sparsify(sparse; indtype = Int8) == sparse
@assert brief(sparsify(sparse; indtype = Int8)) == "2 x 3 x Int64 in Columns (Sparse 4 (67%) [Int8])"

# ReadOnly

read_only = read_only_array(sparse)
@assert sparsify(read_only) === read_only
@assert brief(read_only) == "2 x 3 x Int64 in Columns (ReadOnly, Sparse 4 (67%) [Int64])"

read_only = read_only_array(dense)
@assert sparsify(read_only) == read_only
@assert brief(sparsify(read_only)) == "3 x 4 x Float64 in Columns (ReadOnly, Sparse 12 (100%) [UInt32])"

# Named

using NamedArrays

named = NamedArray(sparse)
@assert sparsify(named) === named
@assert brief(named) == "2 x 3 x Int64 in Columns (Named, Sparse 4 (67%) [Int64])"

named = NamedArray(dense)
@assert sparsify(named) == named
@assert brief(sparsify(named)) == "3 x 4 x Float64 in Columns (Named, Sparse 12 (100%) [UInt32])"

# Permuted

permuted = PermutedDimsArray(sparse, (2, 1))
@assert sparsify(permuted) === permuted
@assert brief(permuted) == "3 x 2 x Int64 in Rows (Permute, Sparse 4 (67%) [Int64])"

unpermuted = PermutedDimsArray(sparse, (1, 2))
@assert sparsify(unpermuted) === unpermuted
@assert brief(unpermuted) == "2 x 3 x Int64 in Columns (!Permute, Sparse 4 (67%) [Int64])"

permuted = PermutedDimsArray(dense, (2, 1))
@assert sparsify(permuted) == permuted
@assert brief(permuted) == "4 x 3 x Float64 in Rows (Permute, Dense)"
@assert brief(sparsify(permuted)) == "4 x 3 x Float64 in Rows (Permute, Sparse 12 (100%) [UInt32])"

unpermuted = PermutedDimsArray(dense, (1, 2))
@assert sparsify(unpermuted) == unpermuted
@assert brief(unpermuted) == "3 x 4 x Float64 in Columns (!Permute, Dense)"
@assert brief(sparsify(unpermuted)) == "3 x 4 x Float64 in Columns (!Permute, Sparse 12 (100%) [UInt32])"

# LinearAlgebra

transposed = transpose(sparse)
@assert sparsify(transposed) === transposed
@assert brief(transposed) == "3 x 2 x Int64 in Rows (Transpose, Sparse 4 (67%) [Int64])"

adjointed = adjoint(sparse)
@assert sparsify(adjointed) === adjointed
@assert brief(adjointed) == "3 x 2 x Int64 in Rows (Adjoint, Sparse 4 (67%) [Int64])"

transposed = transpose(dense)
@assert sparsify(transposed) == transposed
@assert brief(transposed) == "4 x 3 x Float64 in Rows (Transpose, Dense)"
@assert brief(sparsify(transposed)) == "4 x 3 x Float64 in Rows (Transpose, Sparse 12 (100%) [UInt32])"

adjointed = adjoint(dense)
@assert sparsify(adjointed) == adjointed
@assert brief(adjointed) == "4 x 3 x Float64 in Rows (Adjoint, Dense)"
@assert brief(sparsify(adjointed)) == "4 x 3 x Float64 in Rows (Adjoint, Sparse 12 (100%) [UInt32])"

# output


using SparseArrays

# Dense

dense = rand(4)
@assert sparsify(dense) == dense
@assert brief(dense) == "4 x Float64 (Dense)"
@assert brief(sparsify(dense)) == "4 x Float64 (Sparse 4 (100%) [UInt32])"

# Sparse

sparse = SparseVector([0, 1, 2, 0])
@assert sparsify(sparse) === sparse
@assert brief(sparse) == "4 x Int64 (Sparse 2 (50%) [Int64])"

@assert sparsify(sparse; copy = true) == sparse
@assert sparsify(sparse; copy = true) !== sparse
@assert brief(sparsify(sparse)) == "4 x Int64 (Sparse 2 (50%) [Int64])"

@assert sparsify(sparse; eltype = Int8) == sparse
@assert brief(sparsify(sparse; eltype = Int8)) == "4 x Int8 (Sparse 2 (50%) [Int64])"

@assert sparsify(sparse; indtype = Int8) == sparse
@assert brief(sparsify(sparse; indtype = Int8)) == "4 x Int64 (Sparse 2 (50%) [Int8])"

# output


TanayLabUtilities.MatrixFormats.densify Function
densify(matrix::AbstractMatrix; copy::Bool = false, eltype::Maybe{Type} = nothing)::AbstractMatrix
densify(vector::AbstractVector; copy::Bool = false, eltype::Maybe{Type} = nothing)::AbstractVector

Return a dense version of an array, possibly forcing a different eltype. This will preserve the matrix layout (for example, densify of a transposed matrix will be a transposed matrix). If copy, this will create a copy even if it is already dense and has the correct eltype.

using SparseArrays

# Dense

dense = rand(3, 4)
@assert densify(dense) === dense
@assert brief(dense) == "3 x 4 x Float64 in Columns (Dense)"

@assert densify(dense; copy = true) !== dense
@assert densify(dense; copy = true) == dense
@assert brief(densify(dense; copy = true)) == "3 x 4 x Float64 in Columns (Dense)"

@assert isapprox(densify(dense; eltype = Float32), dense)
@assert brief(densify(dense; eltype = Float32)) == "3 x 4 x Float32 in Columns (Dense)"

# Sparse

sparse = SparseMatrixCSC([0 1 2; 3 4 0])

@assert densify(sparse) == sparse
@assert brief(densify(sparse)) == "2 x 3 x Int64 in Columns (Dense)"
@assert brief(densify(sparse; eltype = Int8)) == "2 x 3 x Int8 in Columns (Dense)"

# ReadOnly

read_only = read_only_array(sparse)
@assert densify(read_only) == read_only
@assert brief(read_only) == "2 x 3 x Int64 in Columns (ReadOnly, Sparse 4 (67%) [Int64])"
@assert brief(densify(read_only)) == "2 x 3 x Int64 in Columns (ReadOnly, Dense)"

read_only = read_only_array(dense)
@assert densify(read_only) == dense

# Named

using NamedArrays

named = NamedArray(sparse)
@assert densify(named) == named
@assert brief(named) == "2 x 3 x Int64 in Columns (Named, Sparse 4 (67%) [Int64])"
@assert brief(densify(named)) == "2 x 3 x Int64 in Columns (Named, Dense)"

named = NamedArray(dense)
@assert densify(named) == dense

# Permuted

permuted = PermutedDimsArray(dense, (2, 1))
@assert densify(permuted) === permuted
@assert brief(permuted) == "4 x 3 x Float64 in Rows (Permute, Dense)"

unpermuted = PermutedDimsArray(dense, (1, 2))
@assert densify(unpermuted) === unpermuted
@assert brief(unpermuted) == "3 x 4 x Float64 in Columns (!Permute, Dense)"

permuted = PermutedDimsArray(sparse, (2, 1))
@assert densify(permuted) == permuted
@assert brief(permuted) == "3 x 2 x Int64 in Rows (Permute, Sparse 4 (67%) [Int64])"
@assert brief(densify(permuted)) == "3 x 2 x Int64 in Rows (Permute, Dense)"

unpermuted = PermutedDimsArray(sparse, (1, 2))
@assert densify(unpermuted) == unpermuted
@assert brief(unpermuted) == "2 x 3 x Int64 in Columns (!Permute, Sparse 4 (67%) [Int64])"
@assert brief(densify(unpermuted)) == "2 x 3 x Int64 in Columns (!Permute, Dense)"

# LinearAlgebra

transposed = transpose(dense)
@assert densify(transposed) === transposed
@assert brief(transposed) == "4 x 3 x Float64 in Rows (Transpose, Dense)"

adjointed = adjoint(dense)
@assert densify(adjointed) === adjointed
@assert brief(adjointed) == "4 x 3 x Float64 in Rows (Adjoint, Dense)"

transposed = transpose(sparse)
@assert densify(transposed) == transposed
@assert brief(transposed) == "3 x 2 x Int64 in Rows (Transpose, Sparse 4 (67%) [Int64])"
@assert brief(densify(transposed)) == "3 x 2 x Int64 in Rows (Transpose, Dense)"

adjointed = adjoint(sparse)
@assert densify(adjointed) == adjointed
@assert brief(adjointed) == "3 x 2 x Int64 in Rows (Adjoint, Sparse 4 (67%) [Int64])"
@assert brief(densify(adjointed)) == "3 x 2 x Int64 in Rows (Adjoint, Dense)"

# output


using SparseArrays

# Sparse

sparse = SparseVector([0, 1, 2, 0])

@assert densify(sparse) == sparse
@assert brief(densify(sparse)) == "4 x Int64 (Dense)"

# Dense

dense = rand(4)
@assert densify(dense) === dense
@assert brief(dense) == "4 x Float64 (Dense)"

@assert densify(dense; copy = true) !== dense
@assert densify(dense; copy = true) == dense
@assert brief(densify(dense; copy = true)) == "4 x Float64 (Dense)"

@assert isapprox(densify(dense; eltype = Float32), dense)
@assert brief(densify(dense; eltype = Float32)) == "4 x Float32 (Dense)"

# output


TanayLabUtilities.MatrixFormats.bestify Function
bestify(
    matrix::AbstractMatrix;
    min_sparse_saving_fraction::AbstractFloat = 0.25,
    copy::Bool = false,
    eltype::Maybe{Type} = nothing,
)::AbstractMatrix

bestify(
    vector::AbstractVector;
    min_sparse_saving_fraction::AbstractFloat = 0.25,
    copy::Bool = false,
    eltype::Maybe{Type} = nothing,
)::AbstractVector

Return a "best" (dense or sparse) version of an array. The sparse format is chosen if it saves at least min_sparse_saving_fraction of the storage of the dense format. If copy, this will create a copy even if it is already in the best format.

If eltype is specified, computes the savings (and creates the "best" version) using this element type. In addition, if given a sparse matrix, we consider the indtype_for_size for it, and if that saves min_sparse_saving_fraction relative to the current sparse representation, we'll create a new one using the better (smaller) indtype.
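
The saving criterion can be illustrated with a simple storage model. The formula below is an assumption made for this sketch (bytes of the values plus indices for CSC versus bytes of the dense values); `sparse_saving_fraction` is a hypothetical helper and the package may count storage differently.

```julia
# Hypothetical storage model (an assumption, not the package's actual formula):
# dense storage holds every element; CSC storage holds one value and one row
# index per non-zero, plus one offset per column and a final end offset.
function sparse_saving_fraction(
    n_rows::Integer,
    n_columns::Integer,
    n_nonzeros::Integer,
    value_type::Type,
    index_type::Type,
)::Float64
    dense_bytes = n_rows * n_columns * sizeof(value_type)
    sparse_bytes = n_nonzeros * (sizeof(value_type) + sizeof(index_type)) + (n_columns + 1) * sizeof(index_type)
    return 1.0 - sparse_bytes / dense_bytes
end

# A 5 x 5 Int32 identity-like matrix with UInt32 indices: 64 bytes versus 100.
saving = sparse_saving_fraction(5, 5, 5, Int32, UInt32)
@assert saving ≈ 0.36
@assert saving >= 0.25  # sparse wins under the default threshold
@assert saving < 0.5    # dense wins when at least 50% saving is required
```

Under this model the numbers agree with the doctest for the 5 x 5 diagonal matrix, where the default threshold picks the sparse format and a threshold of 0.5 keeps the dense one.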

using LinearAlgebra

# Dense

dense = zeros(Int32, 5, 5)
view(dense, diagind(dense)) .= 1

@assert bestify(dense) == dense
@assert brief(bestify(dense)) == "5 x 5 x Int32 in Columns (Sparse 5 (20%) [UInt32])"

@assert bestify(dense; min_sparse_saving_fraction = 0.5) === dense

# Sparse

sparse = sparse_matrix_csc(dense)
@assert bestify(sparse) === sparse
@assert brief(sparse) == "5 x 5 x Int32 in Columns (Sparse 5 (20%) [UInt32])"

# ReadOnly

read_only = read_only_array(dense)
@assert bestify(read_only; min_sparse_saving_fraction = 0.5) === read_only
@assert brief(read_only) == "5 x 5 x Int32 in Columns (ReadOnly, Dense)"

@assert bestify(read_only) == read_only
@assert brief(bestify(read_only)) == "5 x 5 x Int32 in Columns (ReadOnly, Sparse 5 (20%) [UInt32])"

read_only = read_only_array(sparse)
@assert bestify(read_only) === read_only
@assert brief(read_only) == "5 x 5 x Int32 in Columns (ReadOnly, Sparse 5 (20%) [UInt32])"

@assert bestify(read_only; min_sparse_saving_fraction = 0.5) == read_only
@assert brief(bestify(read_only; min_sparse_saving_fraction = 0.5)) == "5 x 5 x Int32 in Columns (ReadOnly, Dense)"

# Named

using NamedArrays

named = NamedArray(dense)
@assert bestify(named; min_sparse_saving_fraction = 0.5) === named
@assert brief(named) == "5 x 5 x Int32 in Columns (Named, Dense)"

@assert bestify(named) == named
@assert brief(bestify(named)) == "5 x 5 x Int32 in Columns (Named, Sparse 5 (20%) [UInt32])"

named = NamedArray(sparse)
@assert bestify(named) === named
@assert brief(named) == "5 x 5 x Int32 in Columns (Named, Sparse 5 (20%) [UInt32])"

@assert bestify(named; min_sparse_saving_fraction = 0.5) == named
@assert brief(bestify(named; min_sparse_saving_fraction = 0.5)) == "5 x 5 x Int32 in Columns (Named, Dense)"

# Permuted

permuted = PermutedDimsArray(dense, (2, 1))
@assert bestify(permuted; min_sparse_saving_fraction = 0.5) === permuted
@assert brief(permuted) == "5 x 5 x Int32 in Rows (Permute, Dense)"

@assert bestify(permuted) == permuted
@assert brief(bestify(permuted)) == "5 x 5 x Int32 in Rows (Permute, Sparse 5 (20%) [UInt32])"

permuted = PermutedDimsArray(sparse, (1, 2))
@assert bestify(permuted) === permuted
@assert brief(permuted) == "5 x 5 x Int32 in Columns (!Permute, Sparse 5 (20%) [UInt32])"

@assert bestify(permuted; min_sparse_saving_fraction = 0.5) == permuted
@assert brief(bestify(permuted; min_sparse_saving_fraction = 0.5)) == "5 x 5 x Int32 in Columns (!Permute, Dense)"

# LinearAlgebra

transposed = transpose(dense)
@assert bestify(transposed; min_sparse_saving_fraction = 0.5) === transposed
@assert brief(transposed) == "5 x 5 x Int32 in Rows (Transpose, Dense)"

@assert bestify(transposed) == transposed
@assert brief(bestify(transposed)) == "5 x 5 x Int32 in Rows (Transpose, Sparse 5 (20%) [UInt32])"

adjointed = adjoint(sparse)
@assert bestify(adjointed) === adjointed
@assert brief(adjointed) == "5 x 5 x Int32 in Rows (Adjoint, Sparse 5 (20%) [UInt32])"

@assert bestify(adjointed; min_sparse_saving_fraction = 0.5) == adjointed
@assert brief(bestify(adjointed; min_sparse_saving_fraction = 0.5)) == "5 x 5 x Int32 in Rows (Adjoint, Dense)"

# output


using LinearAlgebra

# Dense

dense = zeros(Int32, 3)
dense[1] = 1

@assert bestify(dense) == dense
@assert brief(bestify(dense)) == "3 x Int32 (Sparse 1 (33%) [UInt32])"

@assert bestify(dense; min_sparse_saving_fraction = 0.5) === dense

# Sparse

sparse = sparse_vector(dense)
@assert bestify(sparse) === sparse
@assert brief(sparse) == "3 x Int32 (Sparse 1 (33%) [UInt32])"

# output


TanayLabUtilities.MatrixFormats.indtype_for_size Function
indtype_for_size(size::Integer)::Type

Return the integer data type which is large enough to hold indices and offsets for a SparseMatrixCSC matrix of some size (total number of elements). We try to use UInt32 whenever possible because for large matrices (especially with 32-bit value types) this will drastically reduce the amount of space used.
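
A minimal sketch of such a choice is shown below. `indtype_sketch` is a hypothetical stand-in, and the exact threshold the package uses is an assumption here; the sketch simply switches to UInt64 once the size no longer fits in a UInt32.

```julia
# Hypothetical re-implementation; the package's exact threshold is an assumption.
function indtype_sketch(n_elements::Integer)::Type
    return n_elements <= typemax(UInt32) ? UInt32 : UInt64
end

@assert indtype_sketch(10_000_000) == UInt32
@assert indtype_sketch(10_000_000_000) == UInt64
```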

println(10000000 => indtype_for_size(10000000))
println(10000000000 => indtype_for_size(10000000000))

# output

10000000 => UInt32
10000000000 => UInt64

TanayLabUtilities.MatrixFormats.colptr Function
colptr(sparse::AbstractMatrix)::AbstractVector{<:Integer}

Return the colptr of a sparse matrix.

using NamedArrays
using SparseArrays

sparse_matrix = SparseMatrixCSC([0 1 2; 3 4 0])
@assert colptr(sparse_matrix) === sparse_matrix.colptr;
@assert colptr(read_only_array(sparse_matrix)) === sparse_matrix.colptr;
@assert colptr(NamedArray(sparse_matrix)) === sparse_matrix.colptr;

# output


TanayLabUtilities.MatrixFormats.rowval Function
rowval(sparse::AbstractArray)::AbstractVector{<:Integer}

Return the rowval of a sparse array.

using NamedArrays
using SparseArrays

sparse_matrix = SparseMatrixCSC([0 1 2; 3 4 0])
@assert rowval(sparse_matrix) === sparse_matrix.rowval;
@assert rowval(read_only_array(sparse_matrix)) === sparse_matrix.rowval;
@assert rowval(NamedArray(sparse_matrix)) === sparse_matrix.rowval;

# output


TanayLabUtilities.MatrixFormats.nzind Function
nzind(sparse::AbstractVector)::AbstractVector{<:Integer}

Return the nzind of a sparse vector.

using NamedArrays
using SparseArrays

sparse_vector = SparseVector([0, 1, 2])
@assert nzind(sparse_vector) === sparse_vector.nzind;
@assert nzind(read_only_array(sparse_vector)) === sparse_vector.nzind;
@assert nzind(NamedArray(sparse_vector)) === sparse_vector.nzind;

# output


TanayLabUtilities.MatrixFormats.nzval Function
nzval(sparse::AbstractArray)::AbstractVector

Return the nzval of a sparse array.

using NamedArrays
using SparseArrays

sparse_matrix = SparseMatrixCSC([0 1 2; 3 4 0])
@assert nzval(sparse_matrix) === sparse_matrix.nzval;
@assert nzval(read_only_array(sparse_matrix)) === sparse_matrix.nzval;
@assert nzval(NamedArray(sparse_matrix)) === sparse_matrix.nzval;

sparse_vector = SparseVector([0, 1, 2])
@assert nzval(sparse_vector) === sparse_vector.nzval;
@assert nzval(read_only_array(sparse_vector)) === sparse_vector.nzval;
@assert nzval(NamedArray(sparse_vector)) === sparse_vector.nzval;

# output

