The Gene Expression Classifier ALLCatchR Identifies B-cell Precursor ALL Subtypes and Underlying Developmental Trajectories Across Age.
Publication date: 2023 Sep
Authors:
Thomas Beder, Björn-Thore Hansen, Alina M Hartmann, Johannes Zimmermann, Eric Amelunxen, Nadine Wolgast, Wencke Walter, Marketa Zaliova, Željko Antić, Philippe Chouvarine, Lorenz Bartsch, Malwine J Barz, Miriam Bultmann, Johanna Horns, Sonja Bendig, Jan Kässens, Christoph Kaleta, Gunnar Cario, Martin Schrappe, Martin Neumann, Nicola Gökbuget, Anke Katharina Bergmann, Jan Trka, Claudia Haferlach, Monika Brüggemann, Claudia D Baldus, Lorenz Bastian
Source:
Experimental Hematology & Oncology
Abstract:
Current classifications (World Health Organization-HAEM5/ICC) define up to 26 molecular B-cell precursor acute lymphoblastic leukemia (BCP-ALL) disease subtypes by genomic driver aberrations and corresponding gene expression signatures. Identification of driver aberrations by transcriptome sequencing (RNA-Seq) is well established, while systematic approaches for gene expression analysis are less advanced. Therefore, we developed ALLCatchR, a machine learning-based classifier that uses RNA-Seq gene expression data to allocate BCP-ALL samples to all 21 gene expression-defined molecular subtypes. Trained on n = 1869 transcriptome profiles with established subtype definitions (4 cohorts; 55% pediatric / 45% adult), ALLCatchR allowed subtype allocation in 3 independent hold-out cohorts (n = 1018; 75% pediatric / 25% adult) with 95.7% accuracy (averaged sensitivity across subtypes: 91.1% / specificity: 99.8%). High-confidence predictions were achieved in 83.7% of samples with 98.9% accuracy. Only 1.2% of samples remained unclassified. ALLCatchR outperformed existing tools and identified novel driver candidates in previously unassigned samples. Additional modules provided predictions of sample blast counts, patient sex, and immunophenotype, allowing imputation in cases where this information is missing. We established a novel RNA-Seq reference of human B-lymphopoiesis using 7 FACS-sorted progenitor stages from healthy bone marrow donors. Implementation in ALLCatchR enabled projection of BCP-ALL samples onto this trajectory. This identified shared proximity patterns of BCP-ALL subtypes to normal lymphopoiesis stages, extending immunophenotypic classifications with a novel framework for developmental comparisons of BCP-ALL. ALLCatchR enables routine application of RNA-Seq for BCP-ALL diagnostics, with systematic gene expression analysis for accurate subtype allocation and novel insights into underlying developmental trajectories.
Copyright © 2023 the Author(s). Published by Wolters Kluwer Health, Inc. on behalf of the European Hematology Association.
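The abstract reports overall accuracy alongside sensitivity and specificity "averaged across subtypes". As a reading aid, the following minimal sketch shows one common way such figures can be computed from true and predicted subtype labels: per-subtype values averaged with equal weight (macro-averaging). It is not taken from ALLCatchR; the function name `subtype_metrics`, the placeholder labels, and the toy data are assumptions made for illustration, and the abstract does not state exactly how the authors averaged.

```python
from collections import Counter


def subtype_metrics(y_true, y_pred):
    """Return (accuracy, macro sensitivity, macro specificity).

    Hypothetical helper for illustration only; not part of ALLCatchR.
    Per-subtype values are averaged with equal weight (macro-average).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    subtypes = sorted(set(y_true))
    if len(subtypes) < 2:
        raise ValueError("specificity needs at least two true subtypes")

    n = len(y_true)
    support = Counter(y_true)  # number of true samples per subtype
    tp = Counter()             # correct calls, keyed by subtype
    fp = Counter()             # wrong calls, keyed by the predicted label
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1

    sensitivities, specificities = [], []
    for s in subtypes:
        negatives = n - support[s]           # samples that are not subtype s
        true_negatives = negatives - fp[s]   # of those, not called s
        sensitivities.append(tp[s] / support[s])
        specificities.append(true_negatives / negatives)

    accuracy = sum(tp.values()) / n
    macro_sens = sum(sensitivities) / len(subtypes)
    macro_spec = sum(specificities) / len(subtypes)
    return accuracy, macro_sens, macro_spec


# Toy labels (made up, not study data): 10 samples, 3 placeholder subtypes.
y_true = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "C"]
y_pred = ["A", "A", "A", "B", "B", "B", "B", "C", "C", "A"]

acc, sens, spec = subtype_metrics(y_true, y_pred)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Because every subtype contributes equally to a macro-average, a rare subtype with a few misclassified samples can pull averaged sensitivity below overall accuracy, which is consistent with the gap between the 95.7% and 91.1% figures above. A prediction such as "unclassified" that never occurs among the true labels simply counts as a miss for the true subtype in this sketch.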