Generalization, Combination and Extension of Functional Clustering Algorithms: The R Package funcy
Abstract
Most methods for clustering functional data project the curves onto a suitable basis and build random-effects models for the basis coefficients; the parameters can be fitted with an EM algorithm. Alternatively, distance-based models on the coefficients are used in the literature. As with the clustering of multidimensional data, many variants of these models have been published. Although their computational procedures are similar, their implementations differ considerably, with distinct hyperparameters and input data formats. This makes it difficult for users to apply the models and, in particular, to compare them. Furthermore, most of them are limited to specific basis functions. This paper aims to show the common elements of existing models from highly cited articles, first on a theoretical level. Their implementations are then analyzed, and it is illustrated how they could be improved and extended to a more general level. Special consideration is given to models designed for sparse measurements. The work resulted in the R package funcy, which was built to integrate the modified and extended algorithms into a single, unified framework.
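To make the basis-projection idea concrete, the following is a minimal sketch in base R of the distance-based variant: curves are projected onto a small basis by least squares, and the resulting coefficient vectors are clustered with k-means. It does not use the funcy API, and it is not the EM-fitted random-effects model. The simulated data, the five-term Fourier basis, and all variable names (`t_grid`, `curves`, `basis`, `coefs`) are assumptions chosen for illustration only.

```r
# Illustrative sketch only: base R, not the funcy API.
set.seed(1)

# Simulate 40 noisy curves on a common regular grid (assumed setup):
# two groups of 20 curves with different mean shapes.
t_grid <- seq(0, 1, length.out = 25)
truth  <- rep(1:2, each = 20)
curves <- t(sapply(truth, function(g) {
  mu <- if (g == 1) sin(2 * pi * t_grid) else cos(2 * pi * t_grid)
  mu + rnorm(length(t_grid), sd = 0.2)
}))  # rows = curves, columns = time points

# Small Fourier basis evaluated on the grid (25 x 5 design matrix).
basis <- cbind(1,
               sin(2 * pi * t_grid), cos(2 * pi * t_grid),
               sin(4 * pi * t_grid), cos(4 * pi * t_grid))

# Least-squares projection: one coefficient vector per curve (40 x 5).
coefs <- t(qr.solve(basis, t(curves)))

# Distance-based clustering of the basis coefficients.
fit <- kmeans(coefs, centers = 2, nstart = 10)

# Cross-tabulate the simulated groups against the estimated clusters.
print(table(truth, fit$cluster))
```

Working on coefficients rather than raw measurements reduces each curve to a short vector, which is what allows standard multivariate clustering tools to be reused. Sparse or irregularly sampled curves would need a different treatment, since a per-curve least-squares fit like this one assumes enough observations per curve.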