Model-based Methods of Classification: Using the mclust Software in Chemometrics
Main Article Content
Abstract
Due to recent advances in methods and software for model-based clustering, and to the interpretability of the results, clustering procedures based on probability models are increasingly preferred over heuristic methods. The clustering process estimates a model for the data that allows for overlapping clusters, producing a probabilistic clustering that quantifies the uncertainty of observations belonging to components of the mixture. The resulting clustering model can also be used for some other important problems in multivariate analysis, including density estimation and discriminant analysis. Examples of the use of model-based clustering and classification techniques in chemometric studies include multivariate image analysis, magnetic resonance imaging, microarray image segmentation, statistical process control, and food authenticity. We review model-based clustering and related methods for density estimation and discriminant analysis, and show how the R package mclust can be applied in each instance.