Identifying Disease of Interest With Deep Learning Using Diagnosis Code
Published: 2023 Mar 20
Authors:
Yoon-Sik Cho, Eunsun Kim, Patrick L Stafford, Min-Hwan Oh, Younghoon Kwon
Source:
Journal of Korean Medical Science
Abstract:
An autoencoder (AE) is a deep learning technique that uses an artificial neural network to reconstruct its input data in the output layer. We constructed a novel supervised AE model and tested its performance in predicting the co-existence of a disease of interest using only diagnostic codes.

Diagnostic codes of one million randomly sampled patients listed in the Korean National Health Information Database in 2019 were used to train, validate, and test the prediction model. The first model used the AE solely as a feature engineering tool whose output served as the input to a classifier: a supervised multi-layer perceptron (sMLP) was added and trained to predict a binary label from the latent representation (AE + sMLP). The second model simultaneously updated the parameters of the AE and the connected MLP classifier during learning (End-to-End Supervised AE [EEsAE]). We tested the performance of these two models against two baseline models, eXtreme Gradient Boosting (XGB) and naïve Bayes, in predicting a co-existing gastric cancer diagnosis.

The proposed EEsAE model yielded the highest F1-score and the highest area under the curve (0.86). The EEsAE and AE + sMLP gave the highest recalls, while XGB yielded the highest precision. An ablation study revealed that iron deficiency anemia, gastroesophageal reflux disease, essential hypertension, gastric ulcer, benign prostatic hyperplasia, and shoulder lesions were the six diagnoses with the greatest influence on performance.

The novel EEsAE model showed promising performance in predicting a disease of interest.

© 2023 The Korean Academy of Medical Sciences.
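The abstract does not give the architecture or loss of the EEsAE, so the following is only a minimal NumPy sketch of the general idea of end-to-end supervised autoencoder training: one gradient step updates the encoder, the decoder, and the classifier head from a single joint loss. Everything specific here is an assumption made for illustration, not the authors' implementation: the single tanh hidden layer, the latent size, the binary cross-entropy losses, the weighting factor `alpha`, the learning rate, and the synthetic multi-hot "diagnosis code" matrix.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, t, eps=1e-9):
    # Element-wise binary cross-entropy between probabilities p and targets t.
    return -(t * np.log(p + eps) + (1 - t) * np.log(1 - p + eps))

def init_params(n_codes, n_latent, rng):
    # Hypothetical layer sizes; small random weights, zero biases.
    return {
        "W1": 0.1 * rng.standard_normal((n_codes, n_latent)),
        "b1": np.zeros(n_latent),
        "W2": 0.1 * rng.standard_normal((n_latent, n_codes)),
        "b2": np.zeros(n_codes),
        "w3": 0.1 * rng.standard_normal(n_latent),
        "b3": 0.0,
    }

def forward(params, X):
    h = np.tanh(X @ params["W1"] + params["b1"])      # latent representation
    x_hat = sigmoid(h @ params["W2"] + params["b2"])  # reconstructed code vector
    p = sigmoid(h @ params["w3"] + params["b3"])      # predicted label probability
    return h, x_hat, p

def joint_loss(params, X, y, alpha):
    # Reconstruction loss plus alpha times classification loss (alpha is assumed).
    _, x_hat, p = forward(params, X)
    recon = bce(x_hat, X).sum(axis=1).mean()
    clf = bce(p, y).mean()
    return recon + alpha * clf

def train_step(params, X, y, alpha, lr):
    # One full-batch gradient-descent step on the joint loss.
    n = X.shape[0]
    h, x_hat, p = forward(params, X)
    dz2 = (x_hat - X) / n          # gradient w.r.t. decoder logits
    dz3 = alpha * (p - y) / n      # gradient w.r.t. classifier logit
    # The latent layer receives gradient from BOTH heads: this is the end-to-end part.
    dh = dz2 @ params["W2"].T + np.outer(dz3, params["w3"])
    dz1 = dh * (1.0 - h ** 2)      # back through tanh
    grads = {
        "W1": X.T @ dz1, "b1": dz1.sum(axis=0),
        "W2": h.T @ dz2, "b2": dz2.sum(axis=0),
        "w3": h.T @ dz3, "b3": dz3.sum(),
    }
    return {k: params[k] - lr * grads[k] for k in params}

# Synthetic stand-in data: 200 "patients", 20 binary "diagnosis codes".
rng = np.random.default_rng(0)
X = (rng.random((200, 20)) < 0.2).astype(float)
y = ((X[:, 0] + X[:, 1]) > 0).astype(float)  # toy label derived from two codes

params = init_params(n_codes=20, n_latent=4, rng=rng)
loss_before = joint_loss(params, X, y, alpha=1.0)
for _ in range(300):
    params = train_step(params, X, y, alpha=1.0, lr=0.1)
loss_after = joint_loss(params, X, y, alpha=1.0)
print(loss_before, loss_after)
```

Under the same assumptions, the two-stage AE + sMLP variant would roughly correspond to first training with `alpha=0` (reconstruction only) and then fitting a separate classifier on the frozen latent vectors `h`, so the label never shapes the learned representation.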