Research Updates
The articles below are published ahead of their appearance in a final issue. Please cite them in the following format: authors (year), title, journal, DOI.

An efficient deep neural network to classify large 3D images with small objects.

Publication date: 2023 Aug 17
Authors: Jungkyu Park, Jakub Chledowski, Stanislaw Jastrzebski, Jan Witowski, Yanqi Xu, Linda Du, Sushma Gaddam, Eric Kim, Alana Lewin, Ujas Parikh, Anastasia Plaunova, Sardius Chen, Alexandra Millet, James Park, Kristine Pysarenko, Shalin Patel, Julia Goldberg, Melanie Wegener, Linda Moy, Laura Heacock, Beatriu Reig, Krzysztof J Geras
Source: IEEE TRANSACTIONS ON MEDICAL IMAGING

Abstract:

3D imaging enables accurate diagnosis by providing spatial information about organ anatomy. However, using 3D images to train AI models is computationally challenging because they consist of 10x or 100x more pixels than their 2D counterparts. To be trained with high-resolution 3D images, convolutional neural networks resort to downsampling them or projecting them to 2D. We propose an effective alternative, a neural network that enables efficient classification of full-resolution 3D medical images. Compared to off-the-shelf convolutional neural networks, our network, 3D Globally-Aware Multiple Instance Classifier (3D-GMIC), uses 77.98%-90.05% less GPU memory and 91.23%-96.02% less computation. While it is trained only with image-level labels, without segmentation labels, it explains its predictions by providing pixel-level saliency maps. On a dataset collected at NYU Langone Health, including 85,526 patients with full-field 2D mammography (FFDM), synthetic 2D mammography, and 3D mammography, 3D-GMIC achieves an AUC of 0.831 (95% CI: 0.769-0.887) in classifying breasts with malignant findings using 3D mammography. This is comparable to the performance of GMIC on FFDM (0.816, 95% CI: 0.737-0.878) and synthetic 2D (0.826, 95% CI: 0.754-0.884), which demonstrates that 3D-GMIC successfully classified large 3D images despite focusing computation on a smaller percentage of its input compared to GMIC. Therefore, 3D-GMIC identifies and utilizes extremely small regions of interest from 3D images consisting of hundreds of millions of pixels, dramatically reducing associated computational challenges. 3D-GMIC generalizes well to BCS-DBT, an external dataset from Duke University Hospital, achieving an AUC of 0.848 (95% CI: 0.798-0.896).
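
The abstract describes the mechanism only at a high level: a cheap global pass over a downsampled copy of the volume produces a saliency map, and an expensive local classifier is then applied only to a handful of small full-resolution patches selected from that map, so the image-level prediction never requires processing the whole volume at high resolution. Below is a minimal, hypothetical PyTorch sketch of that globally-aware multiple-instance idea; the module names (global_net, local_net), the 4x downsampling factor, the top-k patch selection, and the mean aggregation are illustrative assumptions, not the published 3D-GMIC architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GloballyAwareMIC(nn.Module):
    """Illustrative GMIC-style pipeline (a sketch, not the authors' code):
    a low-capacity global module scores a downsampled view of the volume,
    the k most salient locations are cropped at full resolution, and a
    local module classifies only those small patches."""

    def __init__(self, num_patches=4, patch_size=64, downsample=4):
        super().__init__()
        self.num_patches = num_patches
        self.patch_size = patch_size
        self.downsample = downsample
        # Global module: cheap 3D conv net producing 1-channel saliency logits.
        self.global_net = nn.Sequential(
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv3d(8, 1, 1),
        )
        # Local module: applied only to small full-resolution patches.
        self.local_net = nn.Sequential(
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(16, 1),
        )

    def forward(self, volume):
        # volume: (1, 1, D, H, W) at full resolution.
        # 1. Compute saliency on a heavily downsampled copy of the input.
        small = F.interpolate(volume, scale_factor=1 / self.downsample,
                              mode="trilinear", align_corners=False)
        saliency = self.global_net(small)            # (1, 1, d, h, w)
        # 2. Select the k most salient coarse locations.
        _, idx = saliency.flatten(2).topk(self.num_patches, dim=2)
        d, h, w = saliency.shape[2:]
        patch_logits = []
        for i in idx[0, 0]:
            # Map the coarse index back to full-resolution coordinates.
            z, rem = divmod(i.item(), h * w)
            y, x = divmod(rem, w)
            z, y, x = (c * self.downsample for c in (z, y, x))
            p = self.patch_size
            patch = volume[..., z:z + p, y:y + p, x:x + p]
            patch_logits.append(self.local_net(patch))
        # 3. Aggregate patch scores into one image-level logit; training
        #    needs only an image-level label (e.g. BCEWithLogitsLoss).
        return torch.stack(patch_logits).mean()

model = GloballyAwareMIC()
vol = torch.randn(1, 1, 64, 128, 128)  # toy stand-in for a 3D mammogram
logit = model(vol)                     # scalar malignancy logit
```

Because the local module only ever sees num_patches small crops, memory and compute scale with the number and size of the patches rather than with the full volume, which is the source of the GPU-memory and computation savings the abstract quantifies.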