Sergio Gómez   
Departament d'Enginyeria Informàtica i Matemàtiques
Universitat Rovira i Virgili

MultiDendrograms

A hierarchical clustering tool

Description

MultiDendrograms is a simple yet powerful program for the hierarchical clustering of real data, distributed under an Open Source license. Starting from a distance (or similarity) matrix, MultiDendrograms calculates its dendrogram using the most common Agglomerative Hierarchical Clustering algorithms, lets you tune many parameters of the graphical representation, and exports the results to file. A summary of characteristics:

MultiDendrograms implements the variable-group algorithms in [1] to solve the non-uniqueness problem found in the standard pair-group algorithms and implementations. This problem arises when two or more minimum distances between different clusters are equal during the amalgamation process. The standard approach consists of choosing a single pair, thereby breaking the tie between distances, and proceeding in the same way until the final hierarchical classification is obtained. However, different clusterings are possible depending on the criterion used to break the ties (usually a pair is just chosen at random!), and the user is typically unaware of this problem.
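As a rough illustration of the tie situation described above, the following Python sketch lists every pair of items at the minimum distance in an input matrix. The function name `tied_minimum_pairs` and the toy matrix are hypothetical and are not part of MultiDendrograms. The sketch inspects only the initial matrix, whereas ties may also appear at later amalgamation steps.

```python
from itertools import combinations
from math import isclose


def tied_minimum_pairs(dist, rel_tol=1e-9):
    """Return every pair (i, j), i < j, whose distance equals the minimum.

    `dist` is a symmetric matrix given as a list of lists. More than one
    returned pair means the first amalgamation step already has a tie.
    """
    pairs = list(combinations(range(len(dist)), 2))
    d_min = min(dist[i][j] for i, j in pairs)
    # Compare with a tolerance so that rounding noise does not hide a tie.
    return [(i, j) for i, j in pairs
            if isclose(dist[i][j], d_min, rel_tol=rel_tol)]


# Hypothetical 4x4 distance matrix, used only for illustration.
toy = [
    [0.0, 2.0, 4.0, 7.0],
    [2.0, 0.0, 2.0, 5.0],
    [4.0, 2.0, 0.0, 3.0],
    [7.0, 5.0, 3.0, 0.0],
]

ties = tied_minimum_pairs(toy)
print(ties)           # [(0, 1), (1, 2)]
print(len(ties) > 1)  # True: a pair-group algorithm must break this tie
```

Here a pair-group algorithm would have to merge either items 0 and 1 or items 1 and 2 first, and the two choices can lead to different dendrograms.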

The variable-group algorithms group more than two clusters at the same time when ties occur, giving rise to a graphical representation called a multidendrogram. Their main properties are:

MultiDendrograms also introduces a new parameterized type of hierarchical clustering algorithm called Versatile Linkage [2], which includes Single Linkage, Complete Linkage and Arithmetic Linkage as particular cases, and which naturally defines two new algorithms, Geometric Linkage and Harmonic Linkage (hence the convenience of renaming UPGMA as Arithmetic Linkage, to emphasize the existence of different types of averages).
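The relation between these linkages can be sketched in Python under one assumption: that Versatile Linkage takes a generalized (power) mean, with exponent p, of the pairwise distances between the members of two clusters. The function name `power_mean_linkage` is hypothetical, the sketch ignores any cluster weighting used in [2], and it is not the MultiDendrograms implementation.

```python
import math


def power_mean_linkage(distances, p):
    """Generalized (power) mean of positive pairwise distances.

    Assumed correspondence: p = -inf gives the minimum (Single Linkage),
    p = +inf the maximum (Complete Linkage), p = 1 the arithmetic mean,
    p = 0 the geometric mean and p = -1 the harmonic mean.
    """
    if p == math.inf:
        return max(distances)
    if p == -math.inf:
        return min(distances)
    n = len(distances)
    if p == 0:
        # Limit of the power mean as p -> 0 is the geometric mean.
        return math.exp(sum(math.log(d) for d in distances) / n)
    return (sum(d ** p for d in distances) / n) ** (1.0 / p)


# Hypothetical distances between the members of two clusters.
d = [1.0, 2.0, 4.0]

for name, p in [("single", -math.inf), ("harmonic", -1), ("geometric", 0),
                ("arithmetic", 1), ("complete", math.inf)]:
    print(f"{name:10s} {power_mean_linkage(d, p):.4f}")
```

Under this assumption the five values are non-decreasing in p, so the exponent moves the linkage continuously between the Single and Complete Linkage extremes.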

Similar functionality is also available in the mdendro package for the R language.

Comparison with other applications

How do other applications deal with ties?

How do I know if there are ties in my data?

How many binary dendrograms may correspond to one MultiDendrogram?

References

[1] Solving Non-uniqueness in Agglomerative Hierarchical Clustering Using Multidendrograms
Alberto Fernández and Sergio Gómez
Journal of Classification 25 (2008) 43-65
[2] Versatile linkage: a family of space-conserving strategies for agglomerative hierarchical clustering
Alberto Fernández and Sergio Gómez
Journal of Classification 37 (2020) 584-597

Download

Please cite [1] if you use MultiDendrograms in your publications, and [2] if you use Versatile Linkage.

Alternatively, you may use mdendro, a package for the R language, or the Hierarchical_Clustering program in Radatools, which can calculate multidendrograms and also enumerate or count the corresponding binary dendrograms.

Requirements

Running MultiDendrograms requires a recent version of the Java Runtime Environment (JRE) or the Java Development Kit (JDK). The minimum version needed is Java 8, but Java 12 or higher is recommended. From Java 9 onward, the application follows the system scaling, which avoids problems with small fonts or windows on very high resolution (e.g., 4K) screens. You can check whether Java is already installed on your computer, and which version, by following these steps:

We recommend installing OpenJDK instead of the Oracle versions of Java, because of the changes in their license. There are several ways to install OpenJDK 12:

Installation

No installation is needed: just unzip multidendrograms-xxx.zip and run multidendrograms.bat (Windows), multidendrograms.sh (Linux) or multidendrograms.jar (all OS).

Development

You may contribute to the development of MultiDendrograms on GitHub:

History

MultiDendrograms 5.2:

MultiDendrograms 5.1:

MultiDendrograms 5.0:

MultiDendrograms 4.1:

MultiDendrograms 4.0:

MultiDendrograms 3.2:

MultiDendrograms 3.1:

MultiDendrograms 3.0:

MultiDendrograms 2.1:

MultiDendrograms 2.0:

MultiDendrograms 1.0:

Authors

Alberto Fernández:

Sergio Gómez: