Identification of key risk factors for venous thromboembolism in urological inpatients based on the Caprini scale and interpretable machine learning methods.
Published: 2024 Aug 16
Authors:
Chao Liu, Wei-Ying Yang, Fengmin Cheng, Ching-Wen Chien, Yen-Ching Chuang, Yanjun Jin
Source:
Disease Models & Mechanisms
Abstract:
To identify the key risk factors for venous thromboembolism (VTE) in urological inpatients based on the Caprini scale using an interpretable machine learning method.

VTE risk data of urological inpatients were obtained based on the Caprini scale in the case hospital. The Boruta method was then used to select the key variables from the 37 variables in the Caprini scale. Decision rules corresponding to each risk level were generated using the rough set (RS) method. Finally, random forest (RF), support vector machine (SVM), and backpropagation artificial neural network (BPANN) models were used to verify the data accuracy and were compared with the RS method.

Following the screening, the key risk factors for VTE in urology were "(C1) Age," "(C2) Minor surgery planned," "(C3) Obesity (BMI > 25)," "(C8) Varicose veins," "(C9) Sepsis (< 1 month)," "(C10) Serious lung disease incl. pneumonia (< 1 month)," "(C11) COPD," "(C16) Other risk," "(C18) Major surgery (> 45 min)," "(C19) Laparoscopic surgery (> 45 min)," "(C20) Patient confined to bed (> 72 h)," "(C21) Malignancy (present or previous)," "(C23) Central venous access," "(C31) History of DVT/PE," "(C32) Other congenital or acquired thrombophilia," and "(C34) Stroke (< 1 month)." According to the decision rules for the different risk levels obtained using the RS method, "(C1) Age," "(C18) Major surgery (> 45 min)," and "(C21) Malignancy (present or previous)" were the main factors influencing mid- and high-risk levels, and some suggestions on VTE prevention were made based on these three factors. The average accuracies of the RS, RF, SVM, and BPANN models were 79.5%, 87.9%, 92.6%, and 97.2%, respectively. In addition, BPANN had the highest accuracy, recall, F1-score, and precision.

The RS model achieved poorer accuracy than the other three common machine learning models. However, the RS model provides strong interpretability and allows for the identification of high-risk factors and decision rules influencing high-risk assessments of VTE in urology. This transparency is very important for clinicians in the risk assessment process.

© 2024. The Author(s).
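
To illustrate what "decision rules corresponding to each risk level" can look like, the following is a minimal Python sketch of the core rough set idea: group patients that are indiscernible on the selected attributes, then keep a rule only for groups whose members all share one risk level (the lower approximation). It is not the authors' implementation. The records, the attribute names (for example `C1_age_over_60`, a binarized stand-in for the graded age item), and the function names are hypothetical, and a full RS analysis would also compute reducts and simplify the rules.

```python
from collections import defaultdict

# Hypothetical toy records: binary Caprini-style items plus an assigned risk level.
# Attribute names echo the abstract's labels; all values are invented for illustration.
RECORDS = [
    {"C1_age_over_60": 1, "C18_major_surgery": 1, "C21_malignancy": 1, "risk": "high"},
    {"C1_age_over_60": 1, "C18_major_surgery": 1, "C21_malignancy": 1, "risk": "high"},
    {"C1_age_over_60": 1, "C18_major_surgery": 0, "C21_malignancy": 0, "risk": "mid"},
    {"C1_age_over_60": 0, "C18_major_surgery": 1, "C21_malignancy": 0, "risk": "mid"},
    {"C1_age_over_60": 0, "C18_major_surgery": 1, "C21_malignancy": 0, "risk": "low"},
    {"C1_age_over_60": 0, "C18_major_surgery": 0, "C21_malignancy": 0, "risk": "low"},
]
CONDITIONS = ["C1_age_over_60", "C18_major_surgery", "C21_malignancy"]


def indiscernibility_classes(records, conditions):
    """Group records that share identical values on all condition attributes."""
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[c] for c in conditions)
        classes[key].append(rec)
    return classes


def certain_rules(records, conditions, decision="risk"):
    """Return (condition_values, decision, support) for every consistent class.

    A class is consistent when all of its records carry the same decision,
    i.e. it lies in the lower approximation of that decision class.
    """
    rules = []
    for key, members in indiscernibility_classes(records, conditions).items():
        decisions = {m[decision] for m in members}
        if len(decisions) == 1:
            rules.append((dict(zip(conditions, key)), decisions.pop(), len(members)))
    return rules


rules = certain_rules(RECORDS, CONDITIONS)
for cond, outcome, support in rules:
    clause = " AND ".join(f"{name}={value}" for name, value in cond.items())
    print(f"IF {clause} THEN risk={outcome} (support={support})")
```

In this toy data the two patients with major surgery only are labeled differently ("mid" and "low"), so that group is inconsistent and yields no certain rule. Each rule that does come out can be read directly as an IF-THEN statement, which is the kind of transparency the abstract attributes to the RS model.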