Research Updates

Genome-wide multi-mediator analyses using the generalized Berk-Jones statistics with the composite test.

Published: 2023 Sep 04
Authors: En-Yu Lai, Yen-Tsung Huang
Source: BIOINFORMATICS

Abstract:

Mediation analysis evaluates the effects of a hypothesized causal mechanism that runs from an exposure, through mediators, to an outcome. In the age of high-throughput technologies, it has become routine to assess numerous potential mechanisms at the genome or proteome scale, which in turn makes it necessary to address multiple testing. In a sparse scenario where only a few genes or proteins are causally involved, conventional methods for assessing mediation effects lose statistical power because they do not attain the composite null distribution underlying the test. This power loss reduces the number of true mechanisms identified after multiple-testing correction. To obtain p-values that are uniformly distributed under the composite null, Huang (2019, AoAS) proposed the composite test, which provides adjusted p-values for single-mediator analyses. Our contribution is to extend the method to multi-mediator analyses, which are commonly encountered in genomic studies and can accommodate various biological interests. Using the generalized Berk-Jones statistics with the composite test, we propose a multivariate approach that favors dense and diverse mediation effects, a decorrelation approach that favors sparse and consistent effects, and a hybrid approach that captures the advantages of both. Our analysis suite has been implemented as an R package, MACtest. Its utility is demonstrated by analyzing the lung adenocarcinoma datasets from The Cancer Genome Atlas and the Clinical Proteomic Tumor Analysis Consortium. We further investigate the genes and networks whose expression may be regulated by smoking-induced epigenetic aberrations. The R package MACtest is available at https://github.com/roqe/MACtest. Supplementary data are available at Bioinformatics online. © The Author(s) 2023. Published by Oxford University Press.
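
The power loss under the composite null can be illustrated with a small simulation. The sketch below is not the composite test of Huang (2019) and does not use MACtest. It assumes one common conventional method, the joint-significance (max-P) test, and assumes that the exposure-to-mediator and mediator-to-outcome test statistics are independent standard normal z-scores under their respective nulls. The function names and parameters are hypothetical and chosen only for illustration.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def two_sided_p(z):
    """Two-sided p-value for a z-score under a standard normal null."""
    return 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z)))


def maxp_rejection_rate(mu_alpha, mu_beta, n_sim=20000, level=0.05, seed=0):
    """Monte Carlo rejection rate of the max-P (joint-significance) test.

    mu_alpha: mean of the exposure-to-mediator z-score (0 means no effect).
    mu_beta:  mean of the mediator-to-outcome z-score (0 means no effect).
    The two z-scores are drawn independently (an assumption of this sketch).
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        z_alpha = rng.gauss(mu_alpha, 1.0)
        z_beta = rng.gauss(mu_beta, 1.0)
        # Max-P declares mediation only if BOTH component tests are significant.
        p_max = max(two_sided_p(z_alpha), two_sided_p(z_beta))
        if p_max <= level:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    # Both effects absent: the typical case in a sparse genome-wide scan.
    print("alpha = 0, beta = 0 :", maxp_rejection_rate(0.0, 0.0))
    # Only one effect absent, the other strong: still a null for mediation.
    print("alpha = 0, beta large:", maxp_rejection_rate(0.0, 5.0))
```

Under these assumptions, the rejection rate when both effects are absent is close to the square of the nominal level, far below the level itself, whereas it is close to the nominal level when only one effect is absent. A test calibrated for the second case is therefore conservative in the first, which is the situation the abstract describes for sparse settings.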