Downsamples each column of a count matrix so that its sum equals a target number. A random seed can be set for reproducibility, and columns whose sums are smaller than the target can optionally be removed.
downsample_matrix(
  mat,
  target_n = NULL,
  target_q = NULL,
  seed = NULL,
  remove_columns = FALSE
)

mat: An integer matrix to be downsampled. Can be a matrix or a sparse matrix (dgCMatrix). If the matrix contains NAs, the function will run significantly slower. Values that are not integers will be coerced to integers using floor().
target_n: The target sum for each column after downsampling.
target_q: A target quantile of the column sums to downsample to. Only one of 'target_n' or 'target_q' can be provided.
seed: The random seed for reproducibility (default is NULL). If no seed is provided, one is chosen and reported in a message.
remove_columns: Logical indicating whether to remove columns whose sum is smaller than the target (default is FALSE). When FALSE, such columns are left unchanged and a warning is issued.
Returns the downsampled matrix.
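How target_q resolves to a target number can be reproduced in base R. The following is a minimal sketch, assuming the quantile is taken with quantile() at its default settings over colSums(mat); the documentation only states that the target is the quantile of the column sums, so the exact quantile type is an assumption. It also shows which columns would count as "small" and what floor() does to non-integer values.

```r
# Same toy matrix as in the examples: 4 rows, 3 columns.
mat <- matrix(1:12, nrow = 4)

# Column sums are the quantity that target_n and target_q refer to.
col_sums <- colSums(mat)                      # 10 26 42

# Assumption: target_q is resolved with quantile() using R's default type.
target_n <- unname(quantile(col_sums, 0.5))   # 26

# Columns whose sum is below the target cannot be downsampled to it.
small <- col_sums < target_n                  # TRUE FALSE FALSE

# Non-integer values are coerced with floor(), so they round down.
floor(c(1.9, 2.0, 0.5))                       # 1 2 0
```

With these values, the first column (sum 10) is the one that is either left unchanged or removed, depending on remove_columns.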
mat <- matrix(1:12, nrow = 4)
downsample_matrix(mat, 2)
#> ! No seed provided. Using 1540.
#>      [,1] [,2] [,3]
#> [1,]    1    1    1
#> [2,]    0    0    0
#> [3,]    1    1    1
#> [4,]    0    0    0
# Remove columns with small sums
downsample_matrix(mat, 12, remove_columns = TRUE)
#> ! No seed provided. Using 8203.
#> ℹ Removed 1 columns with a sum smaller than 12.
#>      [,1] [,2]
#> [1,]    0    1
#> [2,]    3    4
#> [3,]    3    2
#> [4,]    6    5
# sparse matrix
mat_sparse <- Matrix::Matrix(mat, sparse = TRUE)
downsample_matrix(mat_sparse, 2)
#> ! No seed provided. Using 9883.
#> 4 x 3 sparse Matrix of class "dgCMatrix"
#>
#> [1,] 0 0 2
#> [2,] 0 0 0
#> [3,] 2 0 0
#> [4,] 0 2 0
# with a quantile
downsample_matrix(mat, target_q = 0.5)
#> ℹ Using 26 as the target number (the 0.5 quantile of the column sums).
#> ! No seed provided. Using 7933.
#> Warning: 1 columns have a sum smaller than 26. These columns were not changed. To remove
#> them, set remove_columns=TRUE.
#>      [,1] [,2] [,3]
#> [1,]    1    5    5
#> [2,]    2    6    9
#> [3,]    3    7    7
#> [4,]    4    8    5
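To make the outputs above easier to interpret, here is a minimal sketch of one common way to downsample a single column of counts: treat each count as a set of individual units, draw target_n of them without replacement, and tabulate the draws by row. The helper downsample_column() is hypothetical and is not part of the package; the package's actual algorithm and the way it uses the seed argument are not described here, so set.seed() stands in for it as an assumption.

```r
# Hypothetical helper for illustration only, not the package implementation.
downsample_column <- function(counts, target_n) {
  counts <- floor(counts)
  # A column with fewer counts than the target is returned unchanged.
  if (sum(counts) < target_n) {
    return(counts)
  }
  # One entry per unit count, labelled with the row it came from.
  units <- rep(seq_along(counts), times = counts)
  # Draw target_n units without replacement.
  kept <- units[sample.int(length(units), size = target_n)]
  # Count how many kept units belong to each row.
  tabulate(kept, nbins = length(counts))
}

col <- c(5, 6, 7, 8)   # second column of the example matrix, sum 26

set.seed(42)
a <- downsample_column(col, 12)
set.seed(42)
b <- downsample_column(col, 12)

sum(a)            # 12
all(a <= col)     # TRUE
identical(a, b)   # TRUE: the same seed gives the same result
```

Under this scheme every entry can only stay the same or shrink, and each downsampled column sums exactly to the target, which matches the pattern in the example outputs.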