Development and Validation of Contrast-Enhanced CT-Based Deep Transfer Learning and Combined Clinical-Radiomics Model to Discriminate Thymomas and Thymic Cysts: A Multicenter Study.
Publication date: 2023 Nov 08
Authors:
Yuhua Yang, Jia Cheng, Zhiwei Peng, Li Yi, Ze Lin, Anjing He, Mengni Jin, Can Cui, Ying Liu, QiWen Zhong, Minjing Zuo
Source:
ACADEMIC RADIOLOGY
Abstract:
This study aims to evaluate the feasibility and effectiveness of deep transfer learning (DTL) and clinical-radiomics in differentiating thymomas from thymic cysts.

Clinical and imaging data of 196 patients pathologically diagnosed with thymoma or thymic cyst were retrospectively collected from center 1 (training cohort: n = 137; internal validation cohort: n = 59). An independent external validation cohort comprised 68 thymoma and thymic cyst patients from center 2. Regions of interest (ROIs) were delineated on contrast-enhanced chest computed tomography (CT) images, and eight DTL models were constructed: Densenet 169, Mobilenet V2, Resnet 101, Resnet 18, Resnet 34, Resnet 50, Vgg 13, and Vgg 16. Radiomics features were extracted from the ROIs, and feature selection was performed using the intraclass correlation coefficient (ICC), Spearman correlation analysis, and the least absolute shrinkage and selection operator (LASSO) algorithm. Univariate analysis and multivariable logistic regression (LR) were used to select clinical-radiological features. Six machine learning classifiers, namely LR, support vector machine (SVM), k-nearest neighbors (KNN), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), and Multilayer Perceptron (MLP), were used to construct the Radiomics and Clinico-radiologic models. The features selected for the Radiomics and Clinico-radiologic models were fused to build a Combined model. Receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA) were used to evaluate the discrimination, calibration, and clinical utility of the models, respectively. The DeLong test was used to compare the area under the ROC curve (AUC) between models.
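For readers unfamiliar with the discrimination metric used here, the following is a minimal sketch of how an AUC can be computed from predicted scores, using its interpretation as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study; the abstract does not describe the authors' implementation.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC as the fraction of positive/negative pairs in which
    the positive case has the higher score (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = thymoma, 0 = thymic cyst; scores are invented.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

This pairwise form is quadratic in the number of cases, which is acceptable for cohorts of the size reported here; rank-based formulations give the same value more efficiently.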
K-means clustering was used to subdivide each thymoma or thymic cyst lesion into subregions, traditional radiomics methods were used to extract features from them, and correlation analysis was used to compare the ability of the Radiomics and DTL models to reflect intratumoral heterogeneity.

Among the DTL models, Densenet 169 performed best, with an AUC of 0.933 (95% CI: 0.875-0.991) in the internal validation cohort and 0.962 (95% CI: 0.923-1.000) in the external validation cohort. For the Radiomics model, the AdaBoost classifier achieved AUCs of 0.965 (95% CI: 0.923-1.000) and 0.959 (95% CI: 0.919-1.000) in the internal and external validation cohorts, respectively. For the Clinico-radiologic model, the LightGBM classifier achieved AUCs of 0.805 (95% CI: 0.690-0.920) and 0.839 (95% CI: 0.736-0.943). The AUCs of the Combined model in the internal and external validation cohorts were 0.933 (95% CI: 0.866-1.000) and 0.945 (95% CI: 0.897-0.994), respectively. The DeLong test showed that the Radiomics, DTL, and Combined models outperformed the Clinico-radiologic model in both cohorts (p-values of 0.002, 0.004, and 0.033 in the internal validation cohort and 0.014, 0.006, and 0.015 in the external validation cohort, respectively), but there was no statistically significant difference in performance among the three models (all p-values >0.05). Correlation analysis showed that radiomics performed better than DTL in quantifying differences in intratumoral heterogeneity between thymomas and thymic cysts.

The developed DTL model and the Combined model based on radiomics and clinical-radiologic features achieved excellent diagnostic performance in differentiating thymic cysts from thymomas. They can serve as potential tools to assist clinical decision-making, particularly when endoscopic biopsy carries a high risk.

Copyright © 2023 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
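The K-means subregion step mentioned in the abstract can be illustrated with a minimal sketch that clusters scalar voxel intensities with Lloyd's algorithm. This is an assumption-laden simplification: the abstract does not state which voxel features, how many clusters, or which software the authors used, so the function `kmeans_1d`, the choice of three clusters, and the intensity values below are all hypothetical.

```python
from typing import List, Sequence


def kmeans_1d(values: Sequence[float], k: int, iters: int = 50) -> List[int]:
    """Cluster scalar values with Lloyd's algorithm and return one cluster
    index per value. Initialisation is deterministic (evenly spaced over
    the sorted distinct values), so repeated runs give the same result."""
    distinct = sorted(set(values))
    if k < 1 or k > len(distinct):
        raise ValueError("k must be between 1 and the number of distinct values")
    n = len(distinct)
    if k == 1:
        centroids = [float(distinct[0])]
    else:
        centroids = [float(distinct[i * (n - 1) // (k - 1)]) for i in range(k)]

    assignment = None
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: (v - centroids[c]) ** 2) for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(values, new_assignment) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
        if new_assignment == assignment:
            break  # assignments stopped changing, so the algorithm has converged
        assignment = new_assignment
    return new_assignment


# Hypothetical voxel intensities from one lesion (invented values).
intensities = [10, 12, 11, 80, 82, 79, 45, 47]
subregions = kmeans_1d(intensities, k=3)
print(subregions)
```

In a real pipeline, each voxel would carry its spatial position, so the returned cluster indices would be mapped back onto the ROI mask to form subregion masks before radiomics features are extracted from each subregion.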