Elementary Cluster Analysis: Four Basic Methods that (Usually) Work

English | 2022 | ISBN: 978-8770224253 | 550 Pages | PDF | 207 MB

Elementary Cluster Analysis: Four Basic Methods that (Usually) Work (River Publishers Series in Mathematics, Statistics and Computational Modelling)

The availability of packaged clustering programs means that anyone with data can easily do cluster analysis on it. But many users of this technology don’t fully appreciate its many hidden dangers. In today’s world of “grab and go algorithms,” cluster analysis is very much an art as well as a science, and it is easy to stumble if you don’t understand its pitfalls. Even if you are familiar with them, it is easy to make mistakes. The parenthetical word “usually” in the title is important, because all clustering algorithms can and do fail from time to time.

Modern cluster analysis has become so technically intricate that it is often hard for the beginner or the non-specialist to appreciate and understand its many hidden dangers. This book describes four classical methods for clustering in small, static data sets that have all withstood the tests of time. The youngest of the four methods is now almost 50 years old:

  • Gaussian Mixture Decomposition (GMD, 1898)
  • SAHN Clustering (principally single linkage (SL, 1909))
  • Hard c-means (HCM, 1956, also widely known as (aka) “k-means”)
  • Fuzzy c-means (FCM, 1973, reduces to HCM in a certain limit)

Cluster analysis is a vast topic. The overall picture in clustering is quite overwhelming, so any attempt to swim at the deep end of the pool in even a very specialized subfield requires a lot of training. This book is aimed squarely at beginners and those who are still new to the field.