Multivariate functional data clustering using adaptive density peak detection.
Published: 2023 Feb 24
Authors:
Rui Ren, Kuangnan Fang, Qingzhao Zhang, Xiaofeng Wang
Source:
STATISTICS IN MEDICINE
Abstract:
Clustering multivariate functional data is a challenging problem because the data are represented by a set of curves and functions belonging to an infinite-dimensional space. In this article, we propose a novel clustering method for multivariate functional data using an adaptive density peak detection technique. It is a quick cluster center identification algorithm based on two measures for each functional data observation: the functional density estimate and the distance to the closest observation with a higher functional density. We suggest two types of functional density estimators for multivariate functional data. The first is a functional k-nearest neighbor density estimator based on (a) an L2 distance between raw functional curves, or (b) a semimetric of multivariate functional principal components. The second is a k-nearest neighbor density estimator based on multivariate functional principal scores. Our clustering method is computationally fast since it does not need an iterative process. The flexibility and advantages of the method are examined by comparing it with other existing clustering methods in simulation studies. A user-friendly R package, FADPclust, is developed for public use. Finally, our method is applied to a real case study in lung cancer research. © 2023 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
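To illustrate the two measures named in the abstract (a functional density estimate and the distance to the closest observation with a higher density), the following is a minimal Python sketch under stated assumptions. It is not the authors' implementation and does not use the FADPclust package. The function names, the inverse k-th-neighbor-distance density proxy, the density-times-distance rule for picking centers, and the fixed inputs k and n_clusters are all illustrative choices; the adaptive selection described in the paper is not reproduced here. Curves are assumed to be observed on a common, equally spaced grid.

```python
import numpy as np


def l2_distances(curves, dt):
    """Pairwise L2 distances between multivariate curves.

    curves has shape (n, p, T): n observations, p component functions,
    T equally spaced grid points with spacing dt. The integral is
    approximated by a Riemann sum (an assumption of this sketch).
    """
    n = curves.shape[0]
    flat = curves.reshape(n, -1)
    diff = flat[:, None, :] - flat[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2) * dt)


def knn_density(dist, k):
    """Crude k-NN density proxy: inverse distance to the k-th neighbor.

    This stands in for a functional k-NN density estimator; it is not
    the estimator defined in the paper. Requires k < n.
    """
    kth = np.sort(dist, axis=1)[:, k]  # column 0 is the self-distance 0
    return 1.0 / (kth + 1e-12)


def density_peak_cluster(dist, density, n_clusters):
    """Non-iterative density peak clustering with a fixed cluster count."""
    n = len(density)
    order = np.argsort(-density, kind="stable")  # decreasing density
    delta = np.empty(n)            # distance to nearest denser observation
    parent = np.full(n, -1)        # index of that denser observation
    delta[order[0]] = dist[order[0]].max()  # convention for the densest point
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]      # observations treated as denser than i
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        parent[i] = j

    # Illustrative center rule: largest density * delta.
    score = density * delta
    score[order[0]] = np.inf       # the densest observation is always a center
    centers = np.argsort(-score)[:n_clusters]

    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Single pass in decreasing density: each parent is labeled before its child.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

Each observation is labeled in one pass by inheriting the label of its nearest denser neighbor, which is why no iterative refinement is needed. With curves stored as an array of shape (n, p, T), a call would look like `labels, centers = density_peak_cluster(dist, knn_density(dist, k=5), n_clusters=2)`, where `dist = l2_distances(curves, dt)`.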