Data mining and mathematical models in cancer prognosis and prediction.
Publication date: June 2022
Authors:
Chong Yu, Jin Wang
Source:
Disease Models & Mechanisms
Abstract:
Cancer is a fatal and complex disease. Individual differences among patients with the same cancer type, or within the same patient at different stages of cancer development, may require distinct treatments. Pathological differences are reflected at the tissue, cell and gene levels. Interactions between cancer cells and the surrounding microenvironment can also influence cancer progression and metastasis. Understanding all of these mechanistically and quantitatively is a huge challenge. Researchers have applied pattern recognition algorithms, such as machine learning and data mining, to predict cancer types or classifications. With rapidly growing and increasingly available computing power, researchers have begun to integrate huge data sets, multi-dimensional data types and information. Cells are controlled by gene expression, which is determined by promoter sequences and transcription regulators. For example, changes in gene expression through these underlying mechanisms can modify a cell's progression through the cell cycle. Such molecular activities are governed by gene regulation through the underlying gene regulatory networks, which are essential for cancer studies when the information on gene regulation is clear and available. In this review, we briefly introduce several machine learning methods for cancer prediction and classification, including Artificial Neural Networks (ANNs), Decision Trees (DTs), Support Vector Machines (SVMs) and naive Bayes. We then describe a few typical models for building gene regulatory networks from available data, such as correlation, regression and Bayesian methods. These methods can help with aspects of cancer diagnosis such as susceptibility, recurrence and survival. Finally, we summarize and compare the modeling methods used to analyze the development and progression of cancer through gene regulatory networks. These models can provide possible physical strategies to analyze cancer progression in a systematic and quantitative way. © 2022 the author(s), published by De Gruyter, Berlin/Boston.
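
To illustrate the correlation approach to building gene regulatory networks that the abstract mentions, the following is a minimal sketch and not the authors' implementation. It assumes that an edge is drawn between two genes whenever the absolute Pearson correlation of their expression profiles reaches a chosen threshold. The gene names, expression values, the 0.8 threshold and the function names `pearson` and `correlation_network` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def correlation_network(expression, threshold=0.8):
    """Return undirected edges (gene_a, gene_b, r) with |r| >= threshold."""
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if abs(r) >= threshold:
            edges.append((gene_a, gene_b, round(r, 3)))
    return edges


# Hypothetical expression levels for three made-up genes across five samples.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.0],  # scales with geneA
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],   # only weakly related to the others
}

edges = correlation_network(toy_expression, threshold=0.8)
print(edges)
```

A correlation edge of this kind indicates co-expression only. It carries no direction and does not by itself establish that one gene regulates the other, which is one reason regression and Bayesian methods are also used for network inference.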