This function takes a matrix and downsamples each column to a target number of samples, so that the column sum equals the target. A random seed can be supplied for reproducibility, and columns whose sums are smaller than the target can optionally be removed.
downsample_matrix(
mat,
target_n = NULL,
target_q = NULL,
seed = NULL,
remove_columns = FALSE
)
mat: An integer matrix to be downsampled. Can be a dense matrix or a sparse matrix (dgCMatrix). If the matrix contains NAs, the function will run significantly slower. Values that are not integers will be coerced to integers using floor().
target_n: The target number of samples to downsample each column to.
target_q: A target quantile of the column sums to downsample to. Only one of 'target_n' or 'target_q' can be provided.
seed: The random seed for reproducibility (default is NULL). If no seed is provided, one is chosen and reported in a message.
remove_columns: Logical indicating whether to remove columns whose sum is smaller than the target (default is FALSE). If FALSE, such columns are left unchanged and a warning is issued.
The downsampled matrix
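To make the per-column behaviour concrete, here is a minimal base R sketch of one way to downsample a column of counts. It assumes downsampling means drawing `target_n` individual counts without replacement, that columns with too few counts are left unchanged, and that the input has no NAs. The helper `downsample_column_sketch` is hypothetical and is not the package's implementation.

```r
# Hypothetical helper, not part of tglkmeans: downsample one column of
# counts to `target_n` by sampling individual counts without replacement.
# Assumes the input contains no NAs.
downsample_column_sketch <- function(counts, target_n) {
  # Mirror the documented coercion of non-integer values via floor()
  counts <- as.integer(floor(counts))
  if (sum(counts) < target_n) {
    return(counts) # too few counts: leave the column unchanged
  }
  # Expand counts into one row index per counted item,
  # e.g. c(2, 1) becomes 1, 1, 2
  items <- rep(seq_along(counts), times = counts)
  kept <- items[sample.int(length(items), target_n)]
  # Count how many of the kept items fall in each row
  tabulate(kept, nbins = length(counts))
}

set.seed(60427)
mat <- matrix(1:12, nrow = 4)
down <- apply(mat, 2, downsample_column_sketch, target_n = 12)
colSums(down) # 10 12 12: the first column (sum 10) is left as is
```

Each downsampled entry can only shrink or stay the same, since counts are drawn from the original column rather than generated anew.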
# \dontshow{
# this line is only for CRAN checks
tglkmeans.set_parallel(1)
# }
mat <- matrix(1:12, nrow = 4)
downsample_matrix(mat, 2)
#> ! No seed provided. Using 1540.
#>      [,1] [,2] [,3]
#> [1,]    1    1    1
#> [2,]    0    0    0
#> [3,]    1    1    1
#> [4,]    0    0    0
# Remove columns with small sums
downsample_matrix(mat, 12, remove_columns = TRUE)
#> ! No seed provided. Using 8203.
#> ℹ Removed 1 columns with a sum smaller than 12.
#>      [,1] [,2]
#> [1,]    0    1
#> [2,]    3    4
#> [3,]    3    2
#> [4,]    6    5
# sparse matrix
mat_sparse <- Matrix::Matrix(mat, sparse = TRUE)
downsample_matrix(mat_sparse, 2)
#> ! No seed provided. Using 9883.
#> 4 x 3 sparse Matrix of class "dgCMatrix"
#>
#> [1,] 0 0 2
#> [2,] 0 0 0
#> [3,] 2 0 0
#> [4,] 0 2 0
# with a quantile
downsample_matrix(mat, target_q = 0.5)
#> ℹ Using 26 as the target number (the 0.5 quantile of the column sums).
#> ! No seed provided. Using 7933.
#> Warning: 1 columns have a sum smaller than 26. These columns were not changed. To remove
#> them, set remove_columns=TRUE.
#>      [,1] [,2] [,3]
#> [1,]    1    5    5
#> [2,]    2    6    9
#> [3,]    3    7    7
#> [4,]    4    8    5
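The numbers reported in the quantile example can be checked by hand in base R. This is a small sketch that assumes the target is the default `stats::quantile()` of the column sums; the package may compute it differently internally.

```r
mat <- matrix(1:12, nrow = 4)

# Column sums of the example matrix
col_sums <- colSums(mat) # 10 26 42

# Assumed target: the 0.5 quantile (median) of the column sums
target <- stats::quantile(col_sums, probs = 0.5, names = FALSE) # 26

# Columns whose sum is below the target; these are the ones that are
# left unchanged (or dropped when remove_columns = TRUE)
small <- col_sums < target
sum(small)   # 1
which(small) # 1
```

This matches the messages above: the target is 26 and one column (the first, with sum 10) falls below it.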