Genome-wide search algorithms for identifying dynamic gene co-expression via Bayesian variable selection.
Publication date: 2023 Oct 08
Authors:
Wenda Zhang, Zichen Ma, Lianming Wang, Daping Fan, Yen-Yi Ho
Source:
STATISTICS IN MEDICINE
Abstract:
A wealth of gene expression data generated by high-throughput techniques provides exciting opportunities for studying gene-gene interactions systematically. Gene-gene interactions in a biological system are tightly regulated and are often highly dynamic. The interactions can change flexibly under various internal cellular signals or external stimuli. Previous studies have developed statistical methods to examine these dynamic changes in gene-gene interactions. However, due to the massive number of possible gene combinations that need to be considered in a typical genomic dataset, intensive computation is a common challenge for exploring gene-gene interactions. On the other hand, oftentimes only a small proportion of gene combinations exhibit dynamic co-expression changes. To solve this problem, we propose Bayesian variable selection approaches based on spike-and-slab priors. The proposed algorithms reduce the computational intensity by focusing on identifying subsets of promising gene combinations in the search space. We also adopt a Bayesian multiple hypothesis testing procedure to identify strong dynamic gene co-expression changes. Simulation studies are performed to compare the proposed approaches with existing exhaustive search heuristics. We demonstrate the implementation of our proposed approach to study the association between gene co-expression patterns and overall survival using the RNA-sequencing dataset from The Cancer Genome Atlas breast cancer BRCA-US project. © 2023 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
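
The abstract names two ingredients, a spike-and-slab prior and a Bayesian multiple testing step, without giving details. The following is a minimal illustrative sketch of how these two ideas can fit together in the simplest conjugate setting; it is not the authors' algorithm. It assumes each candidate gene combination has already been summarized by an effect estimate and a standard error, that the slab is a zero-mean normal, and that selection controls an expected (Bayesian) false discovery rate. All function names, parameters, and default values here are hypothetical.

```python
import math


def inclusion_probability(estimate, std_err, prior_inclusion=0.05, slab_sd=1.0):
    """Posterior probability that an effect is nonzero under a
    point-mass spike at 0 and a N(0, slab_sd^2) slab (assumed model).

    Marginally, estimate ~ N(0, std_err^2) under the spike and
    estimate ~ N(0, std_err^2 + slab_sd^2) under the slab.
    """
    spike_var = std_err ** 2
    slab_var = std_err ** 2 + slab_sd ** 2
    # Posterior log-odds of slab vs. spike, computed in log space
    # to avoid underflow of the normal densities.
    log_odds = (
        math.log(prior_inclusion / (1.0 - prior_inclusion))
        + 0.5 * math.log(spike_var / slab_var)
        + 0.5 * estimate ** 2 * (1.0 / spike_var - 1.0 / slab_var)
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


def bayesian_fdr_select(pips, alpha=0.05):
    """Return indices of the largest top-ranked set whose average
    posterior null probability (1 - PIP) does not exceed alpha."""
    order = sorted(range(len(pips)), key=lambda i: pips[i], reverse=True)
    selected = []
    running_error = 0.0
    for rank, i in enumerate(order, start=1):
        running_error += 1.0 - pips[i]
        # The running mean is nondecreasing along the ranking,
        # so we can stop at the first violation.
        if running_error / rank > alpha:
            break
        selected.append(i)
    return selected


if __name__ == "__main__":
    # Hypothetical (estimate, standard error) summaries for four gene combinations.
    summaries = [(1.0, 0.1), (0.02, 0.1), (-0.8, 0.1), (0.15, 0.1)]
    pips = [inclusion_probability(b, s) for b, s in summaries]
    print(bayesian_fdr_select(pips, alpha=0.05))
```

In this sketch the prior inclusion probability encodes the abstract's observation that only a small proportion of gene combinations show dynamic co-expression changes; a full method would typically estimate it, and the slab scale, from the data rather than fixing them.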