A Hierarchical Graph V-Net with Semi-supervised Pre-training for Histological Image based Breast Cancer Classification

Published: 2023 Sep 19

Authors: Yonghao Li, Yiqing Shen, Jiadong Zhang, Shujie Song, Zhenhui Li, Jing Ke, Dinggang Shen

Source: IEEE Transactions on Medical Imaging

Abstract:

Numerous patch-based methods have recently been proposed for breast cancer classification from histological images. However, because they ignore the spatial context of the whole slide image (WSI), their performance can degrade substantially. To address this issue, we propose a novel hierarchical Graph V-Net that integrates 1) patch-level pre-training and 2) context-based fine-tuning with a hierarchical graph network. Specifically, a semi-supervised framework based on knowledge distillation is first developed to pre-train a patch encoder that extracts disease-relevant features. Then, a hierarchical Graph V-Net constructs a hierarchical graph representation from neighboring/similar individual patches for coarse-to-fine classification. Each graph node corresponds to one patch and carries the extracted disease-relevant features; during training, its target label is the average label of all pixels in that patch. To evaluate the proposed hierarchical Graph V-Net, we collect a large dataset of 560 WSIs: 30 labeled WSIs from the BACH dataset (after our further refinement), plus 30 labeled WSIs and 500 unlabeled WSIs from Yunnan Cancer Hospital. The 500 unlabeled WSIs are used for patch-level pre-training to improve feature representation, while the 60 labeled WSIs are used to train and test the hierarchical Graph V-Net. Both comparative assessment and ablation studies demonstrate the superiority of the proposed hierarchical Graph V-Net over state-of-the-art methods in classifying breast cancer from WSIs. The source code and our annotations for the BACH dataset have been released at https://github.com/lyhkevin/Graph-V-Net.
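
To make the node-labeling idea in the abstract concrete, the following is a minimal sketch and not the authors' released code. It assumes that "the average label of all pixels" means the mean of one-hot pixel labels, so each patch gets a per-class fraction, and it assumes non-overlapping square patches joined to their 4-connected grid neighbors. The function names `patch_soft_labels` and `grid_edges`, the patch size, and the class count are hypothetical choices for illustration; edges between feature-similar patches are omitted.

```python
import numpy as np

def patch_soft_labels(mask, patch, num_classes):
    """Average the one-hot pixel labels inside each non-overlapping patch.

    mask: (H, W) integer array of per-pixel class ids (assumed annotation format).
    Returns an array of shape (rows, cols, num_classes) whose last axis sums to 1.
    """
    h, w = mask.shape
    rows, cols = h // patch, w // patch
    mask = mask[:rows * patch, :cols * patch]      # drop any ragged border
    one_hot = np.eye(num_classes)[mask]            # (H, W, C)
    blocks = one_hot.reshape(rows, patch, cols, patch, num_classes)
    return blocks.mean(axis=(1, 3))                # per-patch class fractions

def grid_edges(rows, cols):
    """Undirected edges between 4-connected patches; node id = r * cols + c."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:                       # right neighbor
                edges.append((r * cols + c, r * cols + c + 1))
            if r + 1 < rows:                       # bottom neighbor
                edges.append((r * cols + c, (r + 1) * cols + c))
    return edges

# Toy 4x4 mask with 4 hypothetical classes, split into 2x2 patches.
mask = np.zeros((4, 4), dtype=int)
mask[:2, 2:] = 3                                   # top-right patch is all class 3
mask[0, 0] = 1                                     # one class-1 pixel in the top-left patch
labels = patch_soft_labels(mask, patch=2, num_classes=4)
edges = grid_edges(*labels.shape[:2])
```

In this toy case the top-left patch receives the soft label `[0.75, 0.25, 0, 0]`, since three of its four pixels are class 0 and one is class 1. A soft target of this kind keeps information about mixed-tissue patches that a hard majority vote would discard.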