K-Means
TanayLabUtilities.KMeans
—
Module
Higher-level K-Means functions.
TanayLabUtilities.KMeans.KMeansBuffers
—
Type
KMeansBuffers{T}(; n_dims::Integer, max_k::Integer, n_points::Integer)::KMeansBuffers where {T<: AbstractFloat}
KMeansBuffers(buffers::KMeansBuffers{T}; n_dims::Integer, k::Integer, n_points::Integer) where {T <: AbstractFloat}
Pre-allocate buffers for allocation-free K-Means computation. To create a smaller view that shares storage with existing buffers, call `KMeansBuffers(buffers; n_dims, k, n_points)`.
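As a rough illustration of the two constructors above, the following sketch allocates buffers once and then derives a smaller view from them. The element type and all sizes are arbitrary choices for the example, and it assumes the package is installed and that the type is reachable under the qualified name shown in this page.

```julia
using TanayLabUtilities

# Short alias for the module documented on this page.
KM = TanayLabUtilities.KMeans

# Allocate once, sized for the largest problem we expect
# (illustrative sizes: 5 dimensions, up to 8 clusters, 200 points).
buffers = KM.KMeansBuffers{Float32}(; n_dims = 5, max_k = 8, n_points = 200)

# Derive a smaller view that shares the same storage,
# using the second constructor form listed above.
smaller = KM.KMeansBuffers(buffers; n_dims = 5, k = 4, n_points = 100)
```

Since the view shares storage with `buffers`, it is presumably unsafe to use both for concurrent computations; treat that as an assumption to verify against the implementation.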
TanayLabUtilities.KMeans.KmeansResultView
—
Type
KmeansResultView
A view into `KMeansBuffers` that provides the same accessor interface as `Clustering.KmeansResult` (`assignments`, `counts`, `wcounts`, `nclusters`, `totalcost`) without allocating.
TanayLabUtilities.KMeans.kmeans_in_buffers
—
Function
kmeans_in_buffers(
X::AbstractMatrix{<:Real},
k::Integer;
buffers::Maybe{KMeansBuffers} = nothing,
maxiter::Integer = 100,
tol::Real = 1.0e-6,
distance::SemiMetric = Distances.SqEuclidean,
rng::AbstractRNG = default_rng(),
)::Union{KmeansResult, KmeansResultView}
Same as `kmeans`, but if `buffers` are specified, run allocation-free code. Seeding is restricted to `:kmpp`. Implementation is otherwise identical to `Clustering.kmeans`.
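A minimal usage sketch based only on the signature above. It assumes that `X` holds one point per column (as in Clustering.jl), that the buffers' element type should match the element type of `X`, and that the function is reachable under its qualified name; the data and sizes are made up for the example.

```julia
using Random
using TanayLabUtilities

KM = TanayLabUtilities.KMeans

rng = MersenneTwister(1)
X = rand(rng, Float32, 5, 200)  # 5 dimensions, 200 points (one per column)

# Without buffers: documented to behave like `Clustering.kmeans`.
plain = KM.kmeans_in_buffers(X, 4; rng = rng)

# With buffers: documented to run allocation-free code.
# Sizes here are chosen to match `X` and `k` exactly (an assumption).
buffers = KM.KMeansBuffers{Float32}(; n_dims = 5, max_k = 4, n_points = 200)
in_buffers = KM.kmeans_in_buffers(X, 4; buffers = buffers, rng = rng)
```

Per the return type in the signature, the first call should yield a `KmeansResult` and the second a `KmeansResultView`.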
TanayLabUtilities.KMeans.kmeans_in_buffers!
—
Function
kmeans_in_buffers!(
X::AbstractMatrix{<:Real},
centers::AbstractMatrix{<:AbstractFloat};
buffers::Maybe{KMeansBuffers} = nothing,
maxiter::Integer = 100,
tol::Real = 1.0e-6,
distance::SemiMetric = Distances.SqEuclidean,
rng::AbstractRNG = default_rng(),
)::Union{KmeansResult, KmeansResultView}
Same as `kmeans!`, but if `buffers` are specified, run allocation-free code. Seeding is restricted to `:kmpp`. Implementation is otherwise identical to `Clustering.kmeans!`.
TanayLabUtilities.KMeans.kmeans_in_rounds
—
Function
kmeans_in_rounds(
values_of_points::AbstractMatrix{<:AbstractFloat},
k::Integer;
centers::Maybe{AbstractMatrix{<:AbstractFloat}} = nothing,
buffers::Maybe{Tuple{KMeansBuffers, KMeansBuffers}} = nothing,
rounds::Integer = 10,
rng::AbstractRNG = default_rng(),
)::Union{KmeansResult, KmeansResultView}
Run `kmeans` multiple times with different random seeds (using `rng`) and return the best result. This is needed because K-Means is a heuristic and occasionally gets stuck in a local minimum.
If `buffers` are specified, runs allocation-free using `kmeans_in_buffers` / `kmeans_in_buffers!`. Otherwise (the default), falls back to using `Clustering.kmeans` / `Clustering.kmeans!`.
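The best-of-several-rounds idea can be sketched directly on top of Clustering.jl. This is not the package's implementation: `best_of_rounds` is a hypothetical helper, and it assumes "best" means the lowest `totalcost`, which the text above does not state explicitly.

```julia
using Clustering
using Random

# Hypothetical helper (not part of TanayLabUtilities): run `Clustering.kmeans`
# `rounds` times, drawing seeds from the same `rng`, and keep the result
# with the lowest total cost.
function best_of_rounds(
    X::AbstractMatrix{<:AbstractFloat},
    k::Integer;
    rounds::Integer = 10,
    rng::AbstractRNG = Random.default_rng(),
)
    @assert rounds >= 1
    best = kmeans(X, k; rng = rng)
    for _ in 2:rounds
        candidate = kmeans(X, k; rng = rng)
        if candidate.totalcost < best.totalcost
            best = candidate
        end
    end
    return best
end

rng = MersenneTwister(1)
X = rand(rng, Float32, 5, 200)  # 5 dimensions, 200 points (one per column)
result = best_of_rounds(X, 4; rounds = 10, rng = rng)
```

Because each round advances the shared `rng`, the rounds start from different seeds while the whole procedure stays reproducible for a fixed initial `rng` state.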