Tailoring pretext tasks to improve self-supervised learning in histopathologic subtype classification of lung adenocarcinomas.
Publication date: 2023 Sep 16
Authors:
Ruiwen Ding, Anil Yadav, Erika Rodriguez, Ana Cristina Araujo Lemos da Silva, William Hsu
Source:
COMPUTERS IN BIOLOGY AND MEDICINE
Abstract:
Lung adenocarcinoma (LUAD) is a morphologically heterogeneous disease with five predominant histologic subtypes. Fully supervised convolutional neural networks can improve the accuracy and reduce the subjectivity of LUAD histologic subtyping using hematoxylin and eosin (H&E)-stained whole slide images (WSIs). However, developing supervised models with good prediction accuracy usually requires extensive manual data annotation, which is time-consuming and labor-intensive. This work proposes three self-supervised learning (SSL) pretext tasks to reduce labeling effort. These tasks not only leverage the multi-resolution nature of H&E WSIs but also explicitly consider their relevance to the downstream task of classifying LUAD histologic subtypes. Two tasks involve predicting the spatial relationship between tiles cropped from lower- and higher-magnification WSIs. We hypothesize that these tasks induce the model to learn to distinguish the different tissue structures present in the images, thus benefiting the downstream classification. The third task involves predicting the eosin stain from the hematoxylin stain, inducing the model to learn cytoplasmic features relevant to LUAD subtypes. The effectiveness of the three proposed SSL tasks and their ensemble was demonstrated by comparison with other state-of-the-art pretraining and SSL methods using three publicly available datasets. Our work can be extended to any other cancer type where tissue architectural information is important. The model could be used to expedite and complement routine pathology diagnosis tasks. The code is available at https://github.com/rina-ding/ssl_luad_classification. Copyright © 2023 The Authors. Published by Elsevier Ltd. All rights reserved.
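To make the third pretext task more concrete, the following is a minimal sketch of how an input/target pair for hematoxylin-to-eosin prediction might be built from an RGB tile. It assumes standard color deconvolution with the Ruifrok-Johnston reference stain vectors (the values scikit-image also uses); the abstract does not say how the authors separate the stains, and the names `separate_stains` and `make_h_to_e_pair` are hypothetical, not taken from the linked repository.

```python
import numpy as np

# Reference optical-density vectors (rows) for hematoxylin, eosin, and DAB.
# ASSUMPTION: Ruifrok-Johnston values; the authors' stain matrix may differ.
RGB_FROM_HED = np.array([
    [0.65, 0.70, 0.29],  # hematoxylin
    [0.07, 0.99, 0.11],  # eosin
    [0.27, 0.57, 0.78],  # DAB, used here only as a residual channel
])
HED_FROM_RGB = np.linalg.inv(RGB_FROM_HED)


def separate_stains(rgb_tile):
    """Return per-pixel (H, E, residual) concentrations for an RGB tile in 0..255."""
    # Scale to (0, 1] and avoid log(0) on fully black pixels.
    rgb = np.clip(np.asarray(rgb_tile, dtype=np.float64) / 255.0, 1e-6, 1.0)
    optical_density = -np.log(rgb)            # Beer-Lambert law
    stains = optical_density @ HED_FROM_RGB   # unmix into stain space
    return np.maximum(stains, 0.0)            # negative concentrations are not physical


def make_h_to_e_pair(rgb_tile):
    """Hypothetical helper: hematoxylin map as model input, eosin map as target."""
    stains = separate_stains(rgb_tile)
    return stains[..., 0], stains[..., 1]
```

In such a setup, a network would receive the hematoxylin map and be trained to regress the eosin map, so no manual labels are needed to form the training signal.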