Predicting breast cancer-specific survival in metaplastic breast cancer patients using machine learning algorithms.
Publication date: 2023
Authors:
Yufan Feng, Natasha McGuire, Alexandra Walton, , Stephen Fox, Antonella Papa, Sunil R Lakhani, Amy E McCart Reed
Source:
BIOMEDICINE & PHARMACOTHERAPY
Abstract:
Metaplastic breast cancer (MpBC) is a rare and aggressive subtype of breast cancer, and data on its prognostic factors and survival prediction are only beginning to emerge. This study aimed to develop machine learning models to predict breast cancer-specific survival (BCSS) in MpBC patients, using a dataset of 160 patients with clinical, pathological, and biological variables. An in-depth variable selection process based on gain ratio and correlation-based methods yielded 10 variables for model estimation. Five models (decision tree with bagging, logistic regression, multilayer perceptron, naïve Bayes, and random forest) were evaluated using 10-fold cross-validation. Despite the constraint posed by the absence of therapeutic information, the random forest model showed the highest performance in predicting BCSS, with an area under the ROC curve of 0.808. This study highlights the potential of machine learning algorithms to predict prognosis for complex and heterogeneous cancer subtypes from clinical datasets, and to contribute to patient management. Further research that incorporates additional variables, such as treatment response, and more advanced machine learning techniques will likely enhance the predictive power of MpBC prognostic models. © 2023 The Authors.
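The abstract names gain ratio as one of the variable selection criteria and the area under the ROC curve as the performance measure, but it does not give formulas or say which software was used. The following is a minimal sketch of the textbook definitions of both quantities in plain Python. The function names and the toy values are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(feature, target):
    """Information gain of `target` given `feature`, divided by the
    entropy of `feature` itself (the split information)."""
    n = len(target)
    groups = {}
    for f, t in zip(feature, target):
        groups.setdefault(f, []).append(t)
    # Expected entropy of the target after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(target) - conditional
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = breast cancer-specific death, 0 = otherwise.
outcome = [1, 1, 0, 0]
informative = ["hi", "hi", "lo", "lo"]   # separates the outcome perfectly
uninformative = ["x", "y", "x", "y"]     # unrelated to the outcome

print(gain_ratio(informative, outcome))
print(gain_ratio(uninformative, outcome))
print(roc_auc(outcome, [0.9, 0.4, 0.6, 0.1]))
```

In a selection step of the kind the abstract describes, each candidate variable would be scored this way and the highest-ranked ones retained. Continuous variables would need to be discretized first, which this sketch does not cover.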