Integrative Analysis of Single-Cell and Bulk RNA Sequencing Reveals Prognostic Characteristics of Macrophage Polarization-Related Genes in Lung Adenocarcinoma.
Publication date: 2023
Authors:
Ke Mi, Lizhong Zeng, Yang Chen, Shuanying Yang
Source:
GENES & DEVELOPMENT
Abstract:
Lung adenocarcinoma (LUAD) is a group of cancers with poor prognosis. Combining single-cell RNA sequencing (scRNA-seq) with bulk RNA sequencing (RNA-seq) can identify important genes involved in cancer development and progression from a broader perspective.
The scRNA-seq and bulk RNA-seq data of LUAD were downloaded from the Gene Expression Omnibus (GEO) database and The Cancer Genome Atlas (TCGA) database. scRNA-seq data for core cells in the GSE131907 dataset were analyzed, with uniform manifold approximation and projection (UMAP) used for dimensionality reduction and cluster identification. Macrophage polarization-associated subtypes were then derived from the TCGA-LUAD dataset, and differentially expressed genes (DEGs) were identified in TCGA-LUAD for two comparisons: normal versus LUAD tissue samples, and between the two subtypes. Venn diagrams were used to visualize the overlap between differentially expressed and highly variable macrophage polarization-related genes (MPRGs). A prognostic risk model for LUAD patients was then constructed by univariate Cox and least absolute shrinkage and selection operator (LASSO) regression, and its stability was assessed in the external dataset GSE72094. After analyzing the correlation between the signature genes and significantly mutated genes, immune infiltration was compared between the high- and low-risk groups. The Monocle package was applied for pseudotime trajectory analysis of the different cell clusters within the macrophage clusters. Macrophage cell clusters were then selected as key cell clusters to explore the role of the signature genes in different cell populations and to identify transcription factors (TFs) that affect the signature genes. Finally, qPCR was employed to validate the expression levels of the prognostic signature genes in LUAD.
In total, 424 highly variable macrophage genes, 3920 DEGs, and 9561 DEGs were obtained from the macrophage clusters, the macrophage polarization-related subtypes, and the normal/LUAD tissue samples, respectively. Twenty-eight differentially expressed and highly variable MPRGs (DE-MPRGs) were obtained. A prognostic risk model with 7 DE-MPRGs (RGS13, ADRB2, DDIT4, MS4A2, ALDH2, CTSH, and PKM) was constructed, and it retained good predictive performance in the GSE72094 dataset. ZNF536 and DNAH9 were mutated in the low-risk group, while COL11A1 was mutated in the high-risk group, and these genes were highly correlated with the signature genes. Eleven immune cell types differed significantly between the high- and low-risk groups. Five cell types were further identified within the macrophage cluster; NK cells (CD56hiCD62L+) differentiated earlier and were present mainly on 2 branches, whereas macrophages were present on 2 branches and differentiated later. BCLAF1 and MAX were expressed at higher levels in cluster 1 and might be the TFs affecting the expression of the signature genes. Moreover, qPCR confirmed that the expression of the prognostic genes was generally consistent with the results of the bioinformatic analysis.
This study identified seven MPRGs (RGS13, ADRB2, DDIT4, MS4A2, ALDH2, CTSH, and PKM) as prognostic genes for LUAD and revealed the mechanisms of MPRGs at the single-cell level.
© 2023 Mi et al.
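As a rough illustration of how a Cox/LASSO gene signature such as the seven-gene model in the abstract is commonly applied, the sketch below computes a risk score as a coefficient-weighted sum of expression values and splits patients at the median score. The coefficients, the toy expression values, the function names, and the choice of a median cutoff are all assumptions made for illustration; the abstract reports none of them.

```python
from statistics import median

# Hypothetical coefficients for illustration only: the abstract names the
# seven genes but does not report their fitted Cox/LASSO coefficients.
HYPOTHETICAL_COEFS = {
    "RGS13": -0.10, "ADRB2": -0.08, "DDIT4": 0.12, "MS4A2": -0.05,
    "ALDH2": -0.07, "CTSH": -0.06, "PKM": 0.15,
}


def risk_score(expression, coefs):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(samples, coefs):
    """Score every sample and label it 'high' or 'low' risk.

    The median cutoff is an assumption; scores equal to the cutoff are
    labelled 'low'.
    """
    scores = {sid: risk_score(expr, coefs) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    groups = {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}
    return scores, cutoff, groups


if __name__ == "__main__":
    # Toy expression values (made up), one dict per patient.
    toy_samples = {
        "patient_1": dict.fromkeys(HYPOTHETICAL_COEFS, 1.0),
        "patient_2": {**dict.fromkeys(HYPOTHETICAL_COEFS, 1.0), "PKM": 4.0},
        "patient_3": {**dict.fromkeys(HYPOTHETICAL_COEFS, 1.0), "ALDH2": 5.0},
    }
    scores, cutoff, groups = stratify(toy_samples, HYPOTHETICAL_COEFS)
    for sid in toy_samples:
        print(sid, round(scores[sid], 3), groups[sid])
```

In a real analysis the coefficients would come from the fitted model, and the resulting high- and low-risk groups would then be compared on survival in both the training cohort and the external validation cohort.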