Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

Population-wide evaluation of artificial intelligence and radiologist assessment of screening mammograms.

Published: 2023 Nov 08
Authors: Johanne Kühl, Mohammad Talal Elhakim, Sarah Wordenskjold Stougaard, Benjamin Schnack Brandt Rasmussen, Mads Nielsen, Oke Gerke, Lisbet Brønsro Larsen, Ole Graumann
Source: EUROPEAN RADIOLOGY

Abstract:

To validate an AI system for standalone breast cancer detection on an entire screening population in comparison to first-reading breast radiologists.

All mammography screenings performed between August 4, 2014, and August 15, 2018, in the Region of Southern Denmark with follow-up within 24 months were eligible. Screenings were assessed as normal or abnormal by breast radiologists through double reading with arbitration. For an AI decision of normal or abnormal, two AI-score cut-off points were applied by matching at mean sensitivity (AIsens) and specificity (AIspec) of first readers. Accuracy measures were sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and recall rate (RR).

The sample included 249,402 screenings (149,495 women) and 2033 breast cancers (72.6% screen-detected cancers, 27.4% interval cancers). AIsens had lower specificity (97.5% vs 97.7%; p < 0.0001) and PPV (17.5% vs 18.7%; p = 0.01) and a higher RR (3.0% vs 2.8%; p < 0.0001) than first readers. AIspec was comparable to first readers in terms of all accuracy measures. Both AIsens and AIspec detected significantly fewer screen-detected cancers (1166 (AIsens), 1156 (AIspec) vs 1252; p < 0.0001) but found more interval cancers compared to first readers (126 (AIsens), 117 (AIspec) vs 39; p < 0.0001), with varying types of cancers detected across multiple subgroups.

Standalone AI can detect breast cancer at an accuracy level equivalent to the standard of first readers when the AI threshold point is matched at first reader specificity. However, AI and first readers detected a different composition of cancers.

Replacing first readers with AI with an appropriate cut-off score could be feasible. AI-detected cancers not detected by radiologists suggest a potential increase in the number of cancers detected if AI is implemented to support double reading within screening, although the clinicopathological characteristics of detected cancers would not change significantly.

• Standalone AI cancer detection was compared to first readers in a double-read mammography screening population.
• Standalone AI matched at first reader specificity showed no statistically significant difference in overall accuracy but detected different cancers.
• With an appropriate threshold, AI-integrated screening can increase the number of detected cancers with similar clinicopathological characteristics.

© 2023. The Author(s).
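
The accuracy measures named in the abstract (sensitivity, specificity, PPV, NPV, recall rate) and the idea of choosing an AI-score cut-off that matches first-reader specificity can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the function names, the rule "flag as abnormal when score >= cut-off", the search over observed scores, and the toy data are hypothetical and are not taken from the study, whose exact thresholding procedure is not described in the abstract.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Metrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    recall_rate: float


def confusion_counts(
    scores: Sequence[float], has_cancer: Sequence[bool], cutoff: float
) -> Tuple[int, int, int, int]:
    """Count (TP, FP, FN, TN), flagging a screening as abnormal when score >= cutoff."""
    tp = fp = fn = tn = 0
    for score, cancer in zip(scores, has_cancer):
        flagged = score >= cutoff
        if flagged and cancer:
            tp += 1
        elif flagged:
            fp += 1
        elif cancer:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> Metrics:
    """Standard definitions; assumes every denominator is non-zero."""
    total = tp + fp + fn + tn
    return Metrics(
        sensitivity=tp / (tp + fn),       # cancers flagged / all cancers
        specificity=tn / (tn + fp),       # non-cancers cleared / all non-cancers
        ppv=tp / (tp + fp),               # cancers among flagged screenings
        npv=tn / (tn + fn),               # non-cancers among cleared screenings
        recall_rate=(tp + fp) / total,    # flagged screenings / all screenings
    )


def cutoff_matching_specificity(
    scores: Sequence[float], has_cancer: Sequence[bool], target: float
) -> Optional[float]:
    """Lowest observed score whose specificity reaches the target, or None."""
    for cutoff in sorted(set(scores)):
        _, fp, _, tn = confusion_counts(scores, has_cancer, cutoff)
        if tn / (tn + fp) >= target:
            return cutoff
    return None


# Toy data only (hypothetical, NOT study data): 10 screenings, 2 cancers.
toy_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.55, 0.60, 0.70, 0.90]
toy_cancer = [False, False, False, False, False, True, False, False, False, True]

# Pretend the reference reader has a specificity of 0.75 on this toy set.
cutoff = cutoff_matching_specificity(toy_scores, toy_cancer, target=0.75)
metrics = screening_metrics(*confusion_counts(toy_scores, toy_cancer, cutoff))
print(cutoff, metrics)
```

Matching at specificity fixes the false-positive burden first and lets sensitivity fall where it may; matching at sensitivity (the AIsens setting in the abstract) would instead search for a cut-off on the cancer cases and accept whatever specificity results.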