Statistical Methods for Integrative Clustering of Multi-omics Data.
Publication date: 2023
Authors:
Prabhakar Chalise, Deukwoo Kwon, Brooke L Fridley, Qianxing Mo
Source:
Epigenetics & Chromatin
Abstract:
Cancers are heterogeneous diseases caused by accumulated mutations or abnormal alterations at multiple levels of biological processes, including genomics, epigenomics, transcriptomics, and proteomics. There is great clinical interest in identifying cancer molecular subtypes for disease prognosis and personalized medicine. Integrative clustering is a powerful unsupervised learning method that is increasingly used to identify cancer molecular subtypes from multi-omics data, including somatic mutations, DNA copy numbers, DNA methylation, and gene expression. Integrative clustering methods are generally classified as either model-based or nonparametric. In this chapter, we will give an overview of frequently used model-based methods, including iCluster, iClusterPlus, and iClusterBayes, and of the nonparametric method integrative nonnegative matrix factorization (intNMF). We will use integrative analyses of uveal melanoma and lower-grade glioma to illustrate these representative methods. Finally, we will discuss the strengths and limitations of these methods and give suggestions for performing integrative analyses of cancer multi-omics data in practice. © 2023. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.