Cluster ATAC peaks based on ATAC distributions

Usage

gen_atac_peak_clust(
  atac_mc,
  k = NULL,
  clustering_algoritm = "kmeans",
  cluster_on = "fp",
  peak_set = NULL,
  ...
)

Arguments

atac_mc

a McPeaks object

k

number of clusters; must be specified if clustering_algoritm = 'kmeans'

clustering_algoritm

(optional) either "kmeans" or "louvain"

cluster_on

(optional; default: "fp") which matrix ("mat", "fp" or "egc") to cluster on

...

Arguments passed on to tglkmeans::TGL_kmeans

df

a data frame or a matrix. Each row is a single observation and each column is a dimension. The first column can contain an id for each observation (if id_column is TRUE); otherwise the rownames are used.

metric

distance metric for kmeans++ seeding. Can be 'euclid', 'pearson' or 'spearman'.

max_iter

maximal number of iterations

min_delta

minimal change in assignments (fraction out of all observations) to continue iterating

verbose

display algorithm messages

keep_log

keep algorithm messages in 'log' field

id_column

df's first column contains the observation id

reorder_func

function used to reorder the clusters. It is applied to each center, and the clusters are ordered by the result. For example, reorder_func = mean calculates the mean of each center and reorders the clusters accordingly. If reorder_func = hclust, the centers are ordered by hclust of the euclidean distance of the correlation matrix, i.e. hclust(dist(cor(t(centers)))). If NULL, no reordering is done.

hclust_intra_clusters

run hierarchical clustering within each cluster and return an ordering of the observations.

seed

seed for the c++ random number generator

parallel

run the hierarchical clustering of each cluster in parallel (if hclust_intra_clusters is TRUE)

use_cpp_random

use the c++ random number generator instead of R's. This should be used only for backwards compatibility, as from version 0.4.0 onwards the default random number generator was changed to R's.
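
The two reorder_func behaviours described above can be illustrated on a toy matrix of cluster centers. This is only a sketch in base R, assuming centers are stored with one cluster per row; the names centers, order_by_mean and order_by_hclust are hypothetical, and the code does not call tglkmeans itself.

```r
# Toy cluster centers: 4 clusters (rows) x 5 dimensions (columns).
# These values are made up for illustration only.
set.seed(1)
centers <- matrix(
  rnorm(20),
  nrow = 4,
  dimnames = list(paste0("clust", 1:4), paste0("dim", 1:5))
)

# reorder_func = mean: score each center by its mean,
# then sort the clusters by that score.
order_by_mean <- order(apply(centers, 1, mean))

# reorder_func = hclust: hierarchically cluster the centers using the
# euclidean distance of their correlation matrix, and take the leaf order.
order_by_hclust <- hclust(dist(cor(t(centers))))$order

centers[order_by_mean, ]
centers[order_by_hclust, ]
```

Both results are permutations of the cluster indices; only the order in which clusters are numbered changes, not which observations belong together.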

Value

a named numeric vector specifying the cluster for each peak
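
To make the shape of the return value concrete, the sketch below builds a comparable named vector from a toy matrix. It uses base R stats::kmeans as a stand-in (not the tglkmeans backend, and not gen_atac_peak_clust itself); the peak names and the objects toy_mat and peak_clust are hypothetical.

```r
# Toy peak-by-metacell matrix: 6 hypothetical peaks (rows) x 4 metacells (columns).
set.seed(42)
toy_mat <- rbind(
  matrix(rnorm(12, mean = 0), nrow = 3),
  matrix(rnorm(12, mean = 5), nrow = 3)
)
rownames(toy_mat) <- paste0("peak", 1:6)

# Stand-in clustering with base R kmeans (an assumption for illustration).
km <- stats::kmeans(toy_mat, centers = 2, nstart = 10)

# One cluster id per peak, named by the peak (row) names.
peak_clust <- km$cluster

table(peak_clust)                     # number of peaks in each cluster
split(names(peak_clust), peak_clust)  # peak names grouped by cluster
```

A vector of this form can be tabulated or split by cluster directly, which is usually the first step when inspecting a peak clustering.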

Examples

if (FALSE) {
my_atac_mc <- gen_atac_peak_clust(my_atac_mc, k = 16, cluster_on = "mat")

dyn_p <- identify_dynamic_peaks(my_atac_mc)
my_atac_mc <- gen_atac_peak_clust(my_atac_mc, k = 16, cluster_on = "fp", peak_set = dyn_p)
}