Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

Integrative analysis of GWAS and transcriptomics data reveal key genes for non-small lung cancer.

Published: 2023 Aug 17
Author: Xiangxiong Feng
Source: GENES & DEVELOPMENT

Abstract:

Lung cancer is one of the world's most common and deadly cancers. The two main types of lung cancer are non-small cell lung cancer (NSCLC) and small cell lung cancer (SCLC). More than 85% of lung cancers are NSCLC. Genetic factors play a significant role in the risk of NSCLC, and a growing number of studies examine risk factors at the molecular level. The aim of this study is to build a pipeline that integrates genome-wide association study (GWAS) data and transcriptomics data with machine learning to effectively identify genetic risk factors for NSCLC. GWAS datasets and GWAS summary data, which include lung carcinoma genetic variants in the European population, were downloaded from the GWAS Catalog. Functional analysis of the significant SNPs in the GWAS summary data was then performed using a web server called FUMAGWAS. Transcriptomics data from individuals with and without NSCLC were used to build a machine learning model that identifies the key genes that help predict NSCLC. The most up-regulated and down-regulated genes were identified with the BART cancer web server, and the mechanistic roles of these genes were validated by literature review. By performing an integrative analysis of GWAS and transcriptomics data using machine learning, we identified multiple SNPs and genes related to NSCLC. The computational pipeline may facilitate biomarker discovery for NSCLC and other diseases. © 2023. The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
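
Two steps of the pipeline described in the abstract, selecting significant SNPs from GWAS summary data and ranking up- and down-regulated genes from expression data, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state a significance threshold, so the conventional genome-wide cutoff of 5e-8 is assumed here, and a plain log2 fold change stands in for both the machine learning model and the BART cancer web server. All function names, field names, SNP identifiers, and gene names below are hypothetical toy values.

```python
import math
from statistics import mean

# Conventional genome-wide significance cutoff (assumption; not given in the abstract).
GENOME_WIDE_P = 5e-8


def significant_snps(summary_rows, p_threshold=GENOME_WIDE_P):
    """Return SNP identifiers with p-value below the threshold, most significant first."""
    hits = [row for row in summary_rows if row["p"] < p_threshold]
    return [row["rsid"] for row in sorted(hits, key=lambda row: row["p"])]


def log2_fold_changes(expression, case_ids, control_ids, pseudocount=1.0):
    """Per-gene log2 ratio of mean expression in cases versus controls.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    changes = {}
    for gene, by_sample in expression.items():
        case_mean = mean(by_sample[s] for s in case_ids)
        control_mean = mean(by_sample[s] for s in control_ids)
        changes[gene] = math.log2((case_mean + pseudocount) / (control_mean + pseudocount))
    return changes


def top_regulated(changes, n=1):
    """Return (most up-regulated, most down-regulated) gene lists of length n."""
    ranked = sorted(changes, key=changes.get)  # ascending by fold change
    return ranked[::-1][:n], ranked[:n]


# Toy inputs (hypothetical values, for illustration only).
toy_summary = [
    {"rsid": "rs_demo_1", "p": 3e-9},
    {"rsid": "rs_demo_2", "p": 0.02},
    {"rsid": "rs_demo_3", "p": 1e-12},
]
toy_expression = {
    "GENE_A": {"t1": 15, "t2": 15, "n1": 3, "n2": 3},
    "GENE_B": {"t1": 1, "t2": 1, "n1": 7, "n2": 7},
    "GENE_C": {"t1": 5, "t2": 5, "n1": 5, "n2": 5},
}

print(significant_snps(toy_summary))
toy_changes = log2_fold_changes(toy_expression, ["t1", "t2"], ["n1", "n2"])
print(top_regulated(toy_changes, n=1))
```

In a real analysis the fold-change ranking would typically be replaced by a trained classifier whose feature importances rank the genes, and the SNP list would be passed to a functional annotation tool rather than printed.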