Label-set impact on deep learning-based prostate segmentation on MRI.
Publication date: 2023 Sep 25
Authors:
Jakob Meglič, Mohammed R S Sunoqrot, Tone Frost Bathen, Mattijs Elschot
Source:
Insights into Imaging
Abstract:
Prostate segmentation is an essential step in computer-aided detection and diagnosis systems for prostate cancer. Deep learning (DL)-based methods provide good performance for prostate gland and zones segmentation, but little is known about the impact of manual segmentation (that is, label) selection on their performance. In this work, we investigated these effects by obtaining two different expert label-sets for the PROSTATEx I challenge training dataset (n = 198) and using them, in addition to an in-house dataset (n = 233), to assess the effect on segmentation performance. The automatic segmentation method we used was nnU-Net.
The selection of training/testing label-set had a significant (p < 0.001) impact on model performance. Furthermore, it was found that model performance was significantly (p < 0.001) higher when the model was trained and tested with the same label-set. Moreover, the results showed that agreement between automatic segmentations was significantly (p < 0.0001) higher than agreement between manual segmentations and that the models were able to outperform the human label-sets used to train them.
We investigated the impact of label-set selection on the performance of a DL-based prostate segmentation model. We found that the use of different sets of manual prostate gland and zone segmentations has a measurable impact on model performance. Nevertheless, DL-based segmentation appeared to have a greater inter-reader agreement than manual segmentation. More thought should be given to the label-set, with a focus on multicenter manual segmentation and agreement on common procedures.
Label-set selection significantly impacts the performance of a deep learning-based prostate segmentation model. Models using different label-sets showed higher agreement than manual segmentations.
• Label-set selection has a significant impact on the performance of automatic segmentation models.
• Deep learning-based models demonstrated true learning rather than simply mimicking the label-set.
• Automatic segmentation appears to have a greater inter-reader agreement than manual segmentation.
© 2023. European Society of Radiology (ESR).
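The comparison of agreement between manual segmentations and agreement between automatic segmentations can be illustrated with a small sketch. The abstract does not name the overlap metric used, so the Dice similarity coefficient here is an assumption chosen for illustration, and the function name `dice` and the toy masks are hypothetical rather than taken from the study.

```python
def dice(mask_a, mask_b):
    """Dice overlap between two binary masks given as equal-length 0/1 sequences."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    # Count voxels labeled as foreground in both masks.
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Two empty masks are treated as perfectly agreeing (a common convention).
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy flattened masks for one case (hypothetical data, not from the study).
manual_a = [0, 1, 1, 1, 0, 0]  # expert label-set A
manual_b = [0, 0, 1, 1, 1, 0]  # expert label-set B
auto_a = [0, 1, 1, 1, 0, 0]    # output of a model trained on label-set A
auto_b = [0, 1, 1, 1, 1, 0]    # output of a model trained on label-set B

manual_agreement = dice(manual_a, manual_b)  # agreement between the two experts
auto_agreement = dice(auto_a, auto_b)        # agreement between the two models

print(f"manual vs manual: {manual_agreement:.3f}")
print(f"auto vs auto:     {auto_agreement:.3f}")
```

In a real evaluation the same pair of scores would be computed per case over 3D volumes and then compared with a paired statistical test; the test used by the authors is not stated in the abstract.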