
Introduction

Package hclust1d (Hierarchical CLUSTering for 1D) is a suite of algorithms for univariate agglomerative hierarchical clustering in \(\mathcal{O}(n\log n)\) time. It supports a comprehensive choice of linkage functions; please consult supported_methods for the current list.

The better algorithmic time complexity (compared to multidimensional hierarchical clustering), paired with an efficient C++ implementation, makes hclust1d very fast. Its computational time beats stats::hclust on all data sizes and is on par with fastcluster::hclust for small data sizes. On larger data sizes, however, it is orders of magnitude faster than both multivariate clustering routines.

The output of hclust1d is of the same S3 class and format as the output of stats::hclust or fastcluster::hclust, so the resulting clustering can be further investigated with standard calls to print, plot (which plots a dendrogram), etc. In fact, for 1D cases a call to hclust can simply be replaced by a call to hclust1d in a plug-and-play manner, with the surrounding code unchanged. The how-to is covered in detail in our replacing stats::hclust vignette.
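As a rough illustration of that replacement, the sketch below clusters the same univariate sample with stats::hclust and then with hclust1d. It assumes that hclust1d accepts the raw numeric vector of points and a method argument named as in stats::hclust; the vignette remains the authoritative reference for the exact signature. The hclust1d part is guarded so the script still runs when the package is not installed.

```r
set.seed(42)
x <- rnorm(200)  # 200 univariate points

# Baseline: stats::hclust works on a full pairwise distance matrix
res_stats <- stats::hclust(stats::dist(x), method = "complete")

# Assumed drop-in: hclust1d is assumed here to take the points directly
if (requireNamespace("hclust1d", quietly = TRUE)) {
  res_1d <- hclust1d::hclust1d(x, method = "complete")

  print(class(res_1d))                      # same S3 class as res_stats, per the text above
  plot(res_1d)                              # dendrogram via the standard plot method
  clusters <- stats::cutree(res_1d, k = 3)  # downstream code stays unchanged
  print(table(clusters))
}
```

Because the result has the same class, helpers such as stats::cutree that work on the output of stats::hclust should work on it as well.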

For information on how to get started using hclust1d, see our getting started vignette.

Installing the hclust1d package

To install the development version of the package, please execute

library(devtools)
devtools::install_github("SzymonNowakowski/hclust1d")

Alternatively, to install the current stable CRAN version, please execute

install.packages("hclust1d")

After that, you can load the installed package into memory with a call to library(hclust1d).
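
A quick way to check that the installation worked is sketched below. It assumes that supported_methods is exported as a function callable without arguments, which this document does not state explicitly; the check is skipped if the package is not installed.

```r
# TRUE if the package can be loaded, FALSE otherwise
pkg_available <- requireNamespace("hclust1d", quietly = TRUE)

if (pkg_available) {
  library(hclust1d)
  # Assumption: supported_methods() returns the available linkage functions
  print(supported_methods())
} else {
  message("hclust1d is not installed; see the installation commands above.")
}
```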