Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

epiTCR: a highly sensitive predictor for TCR-peptide binding.

Publication date: 2023 Apr 24
Authors: My-Diem Nguyen Pham, Thanh-Nhan Nguyen, Le Son Tran, Que-Tran Bui Nguyen, Thien-Phuc Hoang Nguyen, Thi Mong Quynh Pham, Hoai-Nghia Nguyen, Hoa Giang, Minh-Duy Phan, Vy Nguyen
Source: Bioinformatics

Abstract:

Predicting the binding between T-cell receptors (TCRs) and peptides presented by HLA molecules is a highly challenging task and a key bottleneck in the development of immunotherapy. Existing prediction tools, despite performing well on the datasets they were built with, suffer from low true positive rates when used to predict epitopes capable of eliciting T-cell responses in patients. An improved tool for TCR-peptide prediction, built on a large dataset that combines existing publicly available data, is therefore still needed.

We collected data from five public databases (IEDB, TBAdb, VDJdb, McPAS-TCR, and 10X) to form a dataset of > 3 million TCR-peptide pairs, 3.27% of which were binding interactions. We proposed epiTCR, a Random Forest-based method dedicated to predicting TCR-peptide interactions. epiTCR takes simple input, TCR CDR3β sequences and antigen sequences, both encoded by flattened BLOSUM62. epiTCR performed with AUC (0.98) and higher sensitivity (0.94) than other existing tools (NetTCR, Imrex, ATM-TCR, and pMTnet), while maintaining comparable prediction specificity (0.9). We identified seven epitopes that contributed to 98.67% of the false positives predicted by epiTCR and exerted similar effects on other tools. We also demonstrated a considerable influence of peptide sequences on prediction, highlighting the need for more diverse peptides in a more balanced dataset. In conclusion, epiTCR is among the best-performing tools thanks to its use of combined data from public sources, and its use will contribute to the quest to identify neoantigens for precision cancer immunotherapy.

epiTCR is available on GitHub (https://github.com/ddiem-ri-4D/epiTCR). Supplementary data are available at Bioinformatics online.

© The Author(s) 2023. Published by Oxford University Press.
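
To make the phrase "encoded by flattened BLOSUM62" concrete, the following is a minimal sketch of what such an encoding could look like. It is not taken from the epiTCR code base: the function names, the zero-padding to a fixed length, and the three-letter toy score table are all assumptions made for illustration, and the placeholder scores should not be relied on as BLOSUM62 values. A real run would use the full 20x20 BLOSUM62 matrix.

```python
from typing import Dict, List

# Illustrative placeholder scores over a 3-letter alphabet (A, C, G).
# Assumption: stand-in for the full 20x20 BLOSUM62 matrix, which is too
# large to inline here. Each residue maps to one row of the matrix.
TOY_MATRIX: Dict[str, List[int]] = {
    "A": [4, 0, 0],
    "C": [0, 9, -3],
    "G": [0, -3, 6],
}


def flatten_encode(seq: str, max_len: int, matrix: Dict[str, List[int]]) -> List[int]:
    """Map each residue to its matrix row, pad with zero rows, then flatten.

    Zero-padding up to max_len is an assumption; it gives every sequence
    the same vector length, which a Random Forest needs as input.
    """
    if len(seq) > max_len:
        raise ValueError(f"sequence longer than max_len={max_len}: {seq}")
    width = len(next(iter(matrix.values())))
    rows = [matrix[aa] for aa in seq]
    rows += [[0] * width for _ in range(max_len - len(seq))]
    return [value for row in rows for value in row]


def encode_pair(
    cdr3b: str,
    peptide: str,
    max_cdr3b: int,
    max_peptide: int,
    matrix: Dict[str, List[int]],
) -> List[int]:
    """Concatenate the flattened CDR3β and peptide encodings into one vector."""
    return flatten_encode(cdr3b, max_cdr3b, matrix) + flatten_encode(
        peptide, max_peptide, matrix
    )


if __name__ == "__main__":
    # Hypothetical toy sequences, not real CDR3β or epitope sequences.
    features = encode_pair("AG", "C", max_cdr3b=2, max_peptide=2, matrix=TOY_MATRIX)
    print(len(features), features)
```

Each TCR-peptide pair then becomes one fixed-length numeric vector, which is the kind of tabular input a Random Forest classifier expects.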