IBRAP: integrated benchmarking single-cell RNA-sequencing analytical pipeline.
Publication date: 2023 Feb 27
Authors:
Connor H Knight, Faraz Khan, Ankit Patel, Upkar S Gill, Jessica Okosun, Jun Wang
Source:
BRIEFINGS IN BIOINFORMATICS
Abstract:
Single-cell ribonucleic acid (RNA)-sequencing (scRNA-seq) is a powerful tool to study cellular heterogeneity. The high-dimensional data generated by this technology are complex and require specialized expertise for analysis and interpretation. The core of scRNA-seq data analysis consists of several key analytical steps: pre-processing, quality control, normalization, dimensionality reduction, integration and clustering. For each step, many algorithms have been developed, with varied underlying assumptions and implications. With such a diverse choice of tools available, benchmarking analyses have compared their performance and demonstrated that tools perform differently depending on data type and complexity. Here, we present the Integrated Benchmarking scRNA-seq Analytical Pipeline (IBRAP), which contains a suite of analytical components that can be interchanged throughout the pipeline, alongside multiple benchmarking metrics that enable users to compare results and determine the optimal pipeline combinations for their data. We apply IBRAP to single- and multi-sample integration analysis using primary pancreatic tissue, cancer cell line and simulated data accompanied by ground-truth cell labels, demonstrating the interchangeable and benchmarking functionality of IBRAP. Our results confirm that the optimal pipelines depend on the individual samples and studies, further supporting the rationale for and necessity of our tool. We then compare reference-based cell annotation with unsupervised analysis, both included in IBRAP, and demonstrate the superiority of the reference-based method in identifying robust major and minor cell types. Thus, IBRAP is a valuable tool for integrating multiple samples and studies to create reference maps of normal and diseased tissues, facilitating novel biological discovery using the vast volume of scRNA-seq data available. © The Author(s) 2023. Published by Oxford University Press.