‘tgstat’ package - Tanay’s group statistical utilities

Package Configuration

Use of BLAS Library

tgstat aims to optimize and expand some of R’s common functionality. Various approaches are used to achieve this goal, including the use of optimized functions provided by the Basic Linear Algebra Subprograms (BLAS) library. Many optimized BLAS implementations are available, both proprietary (e.g. Intel’s MKL, Apple’s vecLib) and opensource (e.g. OpenBLAS, ATLAS). Unfortunately, R often uses by default the reference BLAS implementation, which is known to have poor performance.

tgstat runs best when R is linked with an optimized BLAS implementation. However, having tgstat rely on the reference BLAS will result in very poor performance and is strongly discouraged. If your R implementation uses an optimized BLAS, set options(tgs_use.blas=TRUE) to allow tgstat to make BLAS calls. Otherwise, set options(tgs_use.blas=FALSE) (default) which instructs tgstat to avoid BLAS and instead rely only on its own optimization methods. If in doubt, it is possible to run one of tgstat’s CPU intensive functions (e.g. tgs_cor) and compare its run time under both options(tgs_use.blas=FALSE).

Exact instructions for linking R with an optimized BLAS library are system dependent and are out of scope of this document.

Multitasking

tgstat uses multitasking to speed up performance. The number of concurrent running processes depends on the task and the number of CPU cores available on the specific machine. It is also possible to manually set a hard limit for the maximal number of processes used by options(tgs_max.processes=...).