The Latent Doctor Model for Modeling Inter-Observer Variability.

Published: 2023 Oct 13
Authors: Jasper Linmans, Emiel Hoogeboom, Jeroen van der Laak, Geert Litjens
Source: IEEE Journal of Biomedical and Health Informatics

Abstract:

Many inherently ambiguous tasks in medical imaging suffer from inter-observer variability, resulting in a reference standard defined by a distribution of labels with high variance. Training only on a consensus or majority vote label, as is common in medical imaging, discards valuable information on uncertainty amongst a panel of experts. In this work, we propose to train on the full label distribution to predict the uncertainty within a panel of experts and the most likely ground-truth label. To do so, we propose a new stochastic classification framework based on the conditional variational auto-encoder, which we refer to as the Latent Doctor Model (LDM). In an extensive comparative analysis, we compare the LDM with a model trained on the majority vote label and other methods capable of learning a distribution of labels. We show that the LDM is able to reproduce the reference-standard distribution significantly better than the majority vote baseline. Compared to the other baseline methods, we demonstrate that the LDM performs best at modeling the label distribution and its corresponding uncertainty in two prostate tumor grading tasks. Furthermore, we show competitive performance of the LDM with the more computationally demanding deep ensembles on a tumor budding classification task.
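
To make the contrast between a majority-vote label and a full label distribution concrete, the following is a minimal sketch, not the LDM itself. It assumes a hypothetical panel of five experts voting on four classes; the function names and the vote values are illustrative and do not come from the paper.

```python
import math
from collections import Counter


def label_distribution(votes, num_classes):
    """Empirical distribution of panel votes over classes 0..num_classes-1."""
    counts = Counter(votes)
    return [counts.get(c, 0) / len(votes) for c in range(num_classes)]


def majority_vote(votes):
    """Most common label; ties resolve to the label encountered first."""
    return Counter(votes).most_common(1)[0][0]


def entropy(dist):
    """Shannon entropy in nats; zero when the panel is unanimous."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy of a predicted distribution against a soft target."""
    return -sum(t * math.log(max(q, eps)) for t, q in zip(target, predicted))


# Hypothetical votes from five experts on one image (classes 0..3).
votes = [2, 2, 1, 2, 3]

soft_target = label_distribution(votes, num_classes=4)
hard_label = majority_vote(votes)

# The hard label keeps only "class 2"; the soft target also records that
# two of the five experts disagreed, which shows up as non-zero entropy.
print("majority vote:", hard_label)
print("label distribution:", soft_target)
print("panel entropy (nats):", entropy(soft_target))
```

A classifier trained against `soft_target` with `cross_entropy` is penalized for being overconfident on cases where the panel disagreed, whereas training on `hard_label` alone treats every case as unanimous. The LDM described in the abstract learns the label distribution with a conditional variational auto-encoder, which this sketch does not attempt to reproduce.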