Systematic investigation of the homology sequences around the human fusion gene breakpoints in pan-cancer - bioinformatics study for a potential link to MMEJ.
Publication date: 2023 Aug 26
Authors:
Pora Kim, Himansu Kumar, Chengyuan Yang, Ruihan Luo, Jiajia Liu, Xiaobo Zhou
Source:
BRIEFINGS IN BIOINFORMATICS
Abstract:
Microhomology-mediated end joining (MMEJ) is an error-prone DNA damage repair mechanism. Because it can join DNA ends promiscuously under genomic instability, it frequently leads to chromosomal rearrangements and increases the mutational load in the sequences flanking the breakpoints (BPs). In this study, we systematically investigated the homology sequences around the genomic breakpoint areas of human fusion genes, which are formed by chromosomal rearrangements initiated by DNA double-strand breaks. Because RNA-seq data are the typical data set for detecting fusion genes, we had to infer the most likely genomic breakpoint regions for the known exon-junction fusion breakpoints identified from RNA-seq data. To do so, we used the regions with high feature importance scores calculated by our recently developed fusion BP prediction model, FusionAI, and identified 151 K microhomologies among ~24 K fusion BPs in 20 K fusion genes. Across multiple bioinformatics analyses, we found a relationship between sequence homologies and the immune system. This in silico study will provide novel knowledge on the sequence homologies around the coded structural variants. © The Author(s) 2023. Published by Oxford University Press.