Research Updates

Differential transcript usage analysis incorporating quantification uncertainty via compositional measurement error regression modeling.

Published: 2023 Apr 11
Authors: Amber M Young, Scott Van Buren, Naim U Rashid
Source: BIOSTATISTICS

Abstract:

Differential transcript usage (DTU) occurs when the relative expression of multiple transcripts arising from the same gene changes between conditions. Existing approaches to detecting DTU often rely on computational procedures that can have speed and scalability issues as the number of samples increases. Here we propose a new method, CompDTU, that uses compositional regression to model the relative abundance proportions of each transcript that are of interest in DTU analyses. This procedure leverages fast matrix-based computations that make it ideally suited for DTU analysis with larger sample sizes. The method also allows for testing of, and adjustment for, multiple categorical or continuous covariates. Additionally, many existing approaches for DTU ignore quantification uncertainty in the expression estimates for each transcript in RNA-seq data. We extend our CompDTU method to incorporate quantification uncertainty, leveraging common output from RNA-seq expression quantification tools, in a novel method, CompDTUme. Through several power analyses, we show that CompDTU has excellent sensitivity and reduces false positive results relative to existing methods. Additionally, CompDTUme results in further improvements in performance over CompDTU, given sufficient sample size, for genes with high levels of quantification uncertainty, while also maintaining favorable speed and scalability. We motivate our methods using data from the Cancer Genome Atlas Breast Invasive Carcinoma data set, specifically using RNA-seq data from primary tumors for 740 patients with breast cancer. We show greatly reduced computation time from our new methods as well as the ability to detect several novel genes with significant DTU across different breast cancer subtypes.

© The Author 2023. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
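
To make the idea of compositional regression on transcript proportions concrete, the following is a minimal sketch and not the authors' implementation. It assumes an isometric log-ratio (ilr) transform, a common choice in compositional data analysis that the abstract does not specify, and it compares nested linear models with Wilks' lambda, a standard multivariate statistic chosen here purely for illustration. The simulated gene, the pseudo-count of 0.5, and the function names `ilr_transform` and `wilks_lambda` are all hypothetical. Quantification uncertainty, which CompDTUme addresses, is not modeled.

```python
import numpy as np


def ilr_transform(props):
    """Map D-part compositions (rows sum to 1) to D-1 ilr coordinates."""
    props = np.asarray(props, dtype=float)
    d = props.shape[1]
    # Orthonormal (Helmert-type) basis of the zero-sum subspace.
    basis = np.zeros((d, d - 1))
    for j in range(1, d):
        basis[:j, j - 1] = 1.0 / j
        basis[j, j - 1] = -1.0
        basis[:, j - 1] *= np.sqrt(j / (j + 1.0))
    log_p = np.log(props)
    clr = log_p - log_p.mean(axis=1, keepdims=True)  # centered log-ratio
    return clr @ basis


def wilks_lambda(y, x_full, x_null):
    """det(E_full) / det(E_null) for nested multivariate linear models."""
    def residual_sscp(x):
        # Least-squares fit via matrix computations only.
        beta, *_ = np.linalg.lstsq(x, y, rcond=None)
        resid = y - x @ beta
        return resid.T @ resid

    _, logdet_full = np.linalg.slogdet(residual_sscp(x_full))
    _, logdet_null = np.linalg.slogdet(residual_sscp(x_null))
    return float(np.exp(logdet_full - logdet_null))


# Hypothetical gene with three transcripts whose usage shifts by condition.
rng = np.random.default_rng(0)
n = 40
condition = np.repeat([0, 1], n // 2)
usage = np.where(condition[:, None] == 0, [0.6, 0.3, 0.1], [0.3, 0.3, 0.4])
# Assumed pseudo-count of 0.5 keeps every proportion strictly positive.
counts = np.vstack([rng.multinomial(500, p) for p in usage]) + 0.5
props = counts / counts.sum(axis=1, keepdims=True)

y = ilr_transform(props)                            # shape (n, 2)
x_null = np.ones((n, 1))                            # intercept only
x_full = np.column_stack([np.ones(n), condition])   # intercept + condition
lam = wilks_lambda(y, x_full, x_null)
print(f"Wilks' lambda: {lam:.4f}")
```

Values of Wilks' lambda close to 0 indicate that the condition explains much of the variation in transcript usage, while values close to 1 indicate little evidence of DTU. Additional categorical or continuous covariates could be added as extra columns in both design matrices.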