Self-supervised deep learning for highly efficient spatial immunophenotyping.
Published: 2023 Sep 04
Authors:
Hanyun Zhang, Khalid AbdulJabbar, Tami Grunewald, Ayse U Akarca, Yeman Hagos, Faranak Sobhani, Catherine S Y Lecat, Dominic Patel, Lydia Lee, Manuel Rodriguez-Justo, Kwee Yong, Jonathan A Ledermann, John Le Quesne, E Shelley Hwang, Teresa Marafioti, Yinyin Yuan
Source:
EBioMedicine
Abstract:
Efficient biomarker discovery and clinical translation depend on fast and accurate analytical output from key technologies such as multiplex imaging. However, reliable cell classification often requires extensive annotation. Label-efficient strategies are urgently needed to reveal diverse cell distributions and spatial interactions in large-scale multiplex datasets.

This study proposes Self-supervised Learning for Antigen Detection (SANDI) for accurate cell phenotyping with a reduced annotation burden. The model first learns intrinsic pairwise similarities in unlabelled cell images. A classification step then maps the learnt features to cell labels using a small set of annotated reference cells. We acquired four multiplex immunohistochemistry datasets and one imaging mass cytometry dataset, comprising 2825 to 15,258 single-cell images, to train and test the model.

With 1% of annotations (18-114 cells), SANDI achieved weighted F1-scores ranging from 0.82 to 0.98 across the five datasets. This was comparable to a fully supervised classifier trained on 1828-11,459 annotated cells (difference in averaged weighted F1-score of -0.002 to -0.053; Wilcoxon rank-sum test, P = 0.31). Leveraging the immune checkpoint markers stained in ovarian cancer slides, SANDI-based cell identification revealed spatial expulsion between PD1-expressing T helper cells and T regulatory cells, suggesting an interplay between PD1 expression and T regulatory cell-mediated immunosuppression.

By balancing minimal expert guidance against the power of deep learning to learn similarity within abundant data, SANDI presents new opportunities for efficient, large-scale learning on histology multiplex imaging data.

This study was funded by the Royal Marsden/ICR National Institute of Health Research Biomedical Research Centre.

Copyright © 2023 The Authors. Published by Elsevier B.V. All rights reserved.
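
The two-step idea described in the abstract (features learnt without labels, then a small annotated reference set used to assign cell labels) and the reported metric can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the classifier, so nearest-reference assignment by cosine similarity is an assumption made here, and the 2-D embeddings, the labels, and the function names `cosine`, `assign_labels`, and `weighted_f1` are hypothetical. Only the weighted F1-score follows its standard definition, namely per-class F1 averaged with weights proportional to the number of true cells in each class.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign_labels(embeddings, references):
    """Give each embedding the label of its most similar reference.

    `references` is a list of (embedding, label) pairs standing in for the
    small annotated reference set. Assumed classifier, not the paper's.
    """
    return [max(references, key=lambda r: cosine(e, r[0]))[1] for e in embeddings]


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)


# Toy, made-up embeddings and labels (hypothetical; not data from the study).
references = [([1.0, 0.1], "T_helper"), ([0.1, 1.0], "T_reg")]
cells = [[0.9, 0.2], [0.8, 0.1], [0.2, 0.9], [0.7, 0.6]]
truth = ["T_helper", "T_helper", "T_reg", "T_reg"]

predicted = assign_labels(cells, references)
score = weighted_f1(truth, predicted)
print(predicted)
print(round(score, 3))
```

In this toy case the last cell lies between the two references and is assigned to the wrong class, which lowers the weighted F1-score below 1. In practice the embeddings would come from the self-supervised model rather than being written by hand.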