Research Updates

Breast cancer risk estimation with intelligent algorithms and risk factors for Cuban women.

Published: 2024 Jul 10
Authors: Jose Manuel Valencia-Moreno, Jose Angel Gonzalez-Fraga, Everardo Gutierrez-Lopez, Vivian Estrada-Senti, Hugo Alexis Cantero-Ronquillo, Vitaly Kober
Source: COMPUTERS IN BIOLOGY AND MEDICINE

Abstract:

Breast cancer is the most common malignant neoplasm and the leading cause of cancer mortality among women globally. Current prediction models based on risk factors are inefficient in specific populations, so an appropriate and calibrated breast cancer prediction model for Cuban women is essential. This article proposes a conceptual model for breast cancer risk estimation for Cuban women using machine learning algorithms and risk factors. The model has three main components: knowledge representation, risk estimation modeling, and risk predictor evaluation. Nine of the most common machine learning algorithms were used to generate risk predictors with the proposed model. Two data sources served as case studies: the first comprised data collected from Cuban women, and the second comprised data on US Hispanic women obtained from the Breast Cancer Surveillance Consortium dataset. The results show that the model effectively estimates breast cancer risk and could be a valuable tool for early detection of breast cancer and identification of patients at risk. In the first experiment, the best predictor of breast cancer risk for the Cuban female population was the Random Forest algorithm, with a weighted score of 5.981, a training accuracy of 0.996, and a training AUC of 0.997. In the second experiment, the risk predictors generated by the proposed model from the Cuban women's data obtained better AUC and accuracy values than the predictors generated from the US Hispanic population data, and they are potentially generalizable to other Hispanic populations. Implementing this model could be an economically viable alternative for reducing the mortality rate of this type of cancer in Latin American countries such as Cuba.

Copyright © 2024 The Author(s). Published by Elsevier Ltd. All rights reserved.
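
The abstract ranks predictors by accuracy and AUC. As a minimal sketch of what those two metrics measure, the Python below computes both for a single hypothetical predictor. The labels, scores, and the 0.5 decision threshold are invented toy values, not data or settings from the study, and the helper names `accuracy` and `roc_auc` are illustrative only. The weighted score reported in the abstract is not reproduced here because its definition is not given.

```python
def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = case, 0 = control (NOT from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.65, 0.4, 0.5, 0.1, 0.8, 0.3]

# Assumed decision threshold of 0.5 to turn risk scores into labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(y_true, y_pred))  # 0.75
print(roc_auc(y_true, scores))   # 0.9375
```

Accuracy depends on the chosen threshold, while AUC summarizes how well the scores rank cases above controls across all thresholds, which is presumably why the abstract reports both.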