Downsamples the columns of a count matrix so that each column sums to a target number. A random seed can be supplied for reproducibility, and columns whose sums are smaller than the target can optionally be removed.

downsample_matrix(
  mat,
  target_n = NULL,
  target_q = NULL,
  seed = NULL,
  remove_columns = FALSE
)

Arguments

mat

An integer matrix to be downsampled, given either as a dense matrix or as a sparse matrix (dgCMatrix). If the matrix contains NAs, the function runs significantly slower. Non-integer values are coerced to integers using floor().

target_n

The target number to downsample to, i.e. the sum that each downsampled column should have.

target_q

A target quantile of the column sums to downsample to. Only one of 'target_n' or 'target_q' can be provided.
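
As a rough illustration of how a quantile maps to a target number, the 0.5 quantile of the column sums of the matrix used in the Examples section can be computed in base R. This is only a sketch: it assumes the default method of `quantile()`, and the package may compute or round the value differently.

```r
# Same matrix as in the Examples section
mat <- matrix(1:12, nrow = 4)

# Column sums of the matrix: 10, 26, 42
sums <- colSums(mat)

# 0.5 quantile of the column sums (base R default quantile type; an assumption)
target <- unname(quantile(sums, probs = 0.5))
target
```

For this matrix the result is 26, which agrees with the target number reported in the quantile example.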

seed

The random seed for reproducibility (default is NULL). If no seed is provided, one is chosen and reported in a message.

remove_columns

Logical indicating whether to remove columns whose sums are smaller than the target (default is FALSE). When FALSE, such columns are left unchanged and a warning is issued.
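
To see in advance which columns fall below a given target, the column sums can be compared against it directly. The sketch below uses base R only and assumes that "small" means a column sum strictly smaller than the target, as the messages in the Examples section suggest.

```r
# Same matrix as in the Examples section
mat <- matrix(1:12, nrow = 4)
target_n <- 12

# Columns whose total is below the target cannot be downsampled to it
small <- colSums(mat) < target_n
which(small)

# Columns that would remain after dropping the small ones (under this assumption)
kept <- mat[, !small, drop = FALSE]
dim(kept)
```

Here only the first column (sum 10) is below the target, so two columns remain.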

Value

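The downsampled matrix. Columns with sums smaller than the target are either left unchanged or, if remove_columns = TRUE, removed.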

Examples

mat <- matrix(1:12, nrow = 4)
downsample_matrix(mat, 2)
#> ! No seed provided. Using 1540.
#>      [,1] [,2] [,3]
#> [1,]    1    1    1
#> [2,]    0    0    0
#> [3,]    1    1    1
#> [4,]    0    0    0

# Remove columns with small sums
downsample_matrix(mat, 12, remove_columns = TRUE)
#> ! No seed provided. Using 8203.
#>  Removed 1 columns with a sum smaller than 12.
#>      [,1] [,2]
#> [1,]    0    1
#> [2,]    3    4
#> [3,]    3    2
#> [4,]    6    5

# sparse matrix
mat_sparse <- Matrix::Matrix(mat, sparse = TRUE)
downsample_matrix(mat_sparse, 2)
#> ! No seed provided. Using 9883.
#> 4 x 3 sparse Matrix of class "dgCMatrix"
#>           
#> [1,] 0 0 2
#> [2,] 0 0 0
#> [3,] 2 0 0
#> [4,] 0 2 0

# with a quantile
downsample_matrix(mat, target_q = 0.5)
#>  Using 26 as the target number (the 0.5 quantile of the column sums).
#> ! No seed provided. Using 7933.
#> Warning: 1 columns have a sum smaller than 26. These columns were not changed. To remove
#> them, set remove_columns=TRUE.
#>      [,1] [,2] [,3]
#> [1,]    1    5    5
#> [2,]    2    6    9
#> [3,]    3    7    7
#> [4,]    4    8    5
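
The behaviour shown in these examples can be approximated in base R. The following is a minimal sketch, not the package's implementation: `downsample_column()` and `downsample_sketch()` are hypothetical helpers, and the sketch assumes that counts are drawn without replacement and that columns below the target are left unchanged.

```r
# Hypothetical helper: downsample one column of counts to n total counts
downsample_column <- function(counts, n) {
  counts <- floor(counts)
  if (sum(counts) < n) {
    return(counts)  # too few counts: leave the column unchanged
  }
  # One entry per counted unit, labelled by the row it belongs to
  units <- rep(seq_along(counts), times = counts)
  # Draw n units without replacement, then count how many came from each row
  picked <- units[sample.int(length(units), n)]
  tabulate(picked, nbins = length(counts))
}

# Hypothetical wrapper: apply the helper to every column with a fixed seed
downsample_sketch <- function(mat, target_n, seed) {
  set.seed(seed)  # same seed, same draws
  apply(mat, 2, downsample_column, n = target_n)
}

mat <- matrix(1:12, nrow = 4)
a <- downsample_sketch(mat, target_n = 12, seed = 1)
b <- downsample_sketch(mat, target_n = 12, seed = 1)

colSums(a)       # first column keeps its sum of 10, the others sum to 12
identical(a, b)  # TRUE: repeating with the same seed gives the same matrix
```

Because the sketch uses R's own random number stream, its output will not match the values printed by downsample_matrix() even for the same seed.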