Automated and reproducible cell identification in mass cytometry using neural networks.
Publication date: 2023 Sep 22
Authors:
Hajar Saihi, Conrad Bessant, William Alazawi
Source:
BRIEFINGS IN BIOINFORMATICS
Abstract:
The principal use of mass cytometry is to identify distinct cell types and changes in their composition, phenotype and function in different samples and conditions. Combining data from different studies has the potential to increase the power of these discoveries in diverse fields such as immunology, oncology and infection. However, current tools lack scalable, reproducible and automated methods to integrate and study mass cytometry data sets, which often use heterogeneous approaches to study similar samples. To address these limitations, we present two novel developments: (1) a pre-trained cell identification model named Immunopred that allows automated identification of immune cells without user-defined prior knowledge of expected cell types and (2) a fully automated cytometry meta-analysis pipeline built around Immunopred. We evaluated this pipeline on six COVID-19 study data sets comprising 270 unique samples and uncovered novel significant phenotypic changes in the wider immune landscape of COVID-19 that were not identified when each study was analyzed individually. Applied widely, our approach will support the discovery of novel findings in research areas where cytometry data sets are available for integration. © The Author(s) 2023. Published by Oxford University Press.