A novel proteomic-based model for predicting colorectal cancer with Schistosoma japonicum co-infection by integrated bioinformatics analysis and machine learning.
Published: 2023 Oct 30
Authors:
Shan Li, Xuguang Sun, Ting Li, Yanqing Shi, Binjie Xu, Yuyong Deng, Sifan Wang
Source:
GENOMICS PROTEOMICS & BIOINFORMATICS
Abstract:
Schistosoma japonicum infection is an important public health problem and is associated with a variety of diseases, including colorectal cancer (CRC). We collected paraffin samples from CRC patients with or without S. japonicum infection according to standard procedures. Data-independent acquisition was used to identify differentially expressed proteins (DEPs). Protein-protein interaction (PPI) network construction, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) functional enrichment analysis, and a machine learning algorithm (least absolute shrinkage and selection operator (LASSO) regression) were then used to identify candidate genes for diagnosing CRC with S. japonicum infection. To assess diagnostic value, a nomogram and a receiver operating characteristic (ROC) curve were developed. A total of 115 DEPs were screened. According to the analyses, these DEPs were mostly related to the biological processes of generation of precursor metabolites and energy, energy derivation by oxidation of organic compounds, carboxylic acid metabolic process, oxoacid metabolic process, cellular respiration, and aerobic respiration. Enrichment analysis showed that the DEPs might regulate oxidoreductase activity, transporter activity, transmembrane transporter activity, ion transmembrane transporter activity, and inorganic molecular entity transmembrane transporter activity. Following PPI network construction and LASSO regression, 13 genes (hsd17b4, h2ac4, hla-c, pc, epx, rpia, tor1aip1, mindy1, dpysl5, nucks1, cnot2, ndufa13 and dnm3) were retained, and 3 candidate hub genes were chosen after machine learning for nomogram construction and diagnostic value evaluation. The nomogram and all 3 candidate hub genes (hsd17b4, rpia and cnot2) had high diagnostic value (area under the curve of 0.9556). Our results indicate that the combination of hsd17b4, rpia, and cnot2 may serve as a predictive model for CRC with S. japonicum co-infection. This study also provides new clues for research into the mechanisms linking S. japonicum infection and CRC. © 2023. The Author(s).
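For readers unfamiliar with the two quantitative steps named in the abstract, the following is a minimal sketch of LASSO-style feature selection followed by an ROC area-under-the-curve calculation. It is not the authors' pipeline: the data are synthetic, the model is a plain linear LASSO fitted to centred 0/1 labels (a simplification; the abstract does not say which LASSO variant or software was used), and every function name, the penalty value, and the sample sizes are hypothetical choices made for illustration only.

```python
import math
import random


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; small values become exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + penalty*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(n=200, p=10, informative=(0, 1, 2), seed=0):
    """Hypothetical data: p 'protein' features, a few of them shifted in class 1."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]
    X = []
    for label in labels:
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for j in informative:
            row[j] += 2.0 * label
        X.append(row)
    return X, labels


def standardize(X):
    """Centre each column and scale it to unit variance."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) for j in range(p)]
    return [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in X]


def run_demo(penalty=0.05):
    """Select features with LASSO, then score the samples and compute the AUC."""
    X, labels = make_synthetic()
    Xs = standardize(X)
    y_mean = sum(labels) / len(labels)
    y = [lab - y_mean for lab in labels]  # centring removes the need for an intercept
    beta = lasso_coordinate_descent(Xs, y, penalty)
    selected = [j for j, b in enumerate(beta) if b != 0.0]
    scores = [sum(b * x for b, x in zip(beta, row)) for row in Xs]
    return selected, roc_auc(labels, scores)


if __name__ == "__main__":
    selected, auc = run_demo()
    print("selected feature indices:", selected)
    print("in-sample AUC:", round(auc, 4))
```

The AUC here is computed on the same synthetic samples used for fitting, so it is optimistic by construction; a real evaluation would use held-out data or cross-validation, and the penalty would normally be tuned rather than fixed.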