ACP-ESM: A novel framework for classification of anticancer peptides using protein-oriented transformer approach.
Published: 2024 Aug 20
Authors:
Zeynep Hilal Kilimci, Mustafa Yalcin
Source:
ARTIFICIAL INTELLIGENCE IN MEDICINE
Abstract:
Anticancer peptides (ACPs) are a class of molecules that have gained significant attention in cancer research and therapy. ACPs are short chains of amino acids, the building blocks of proteins, and they can selectively target and kill cancer cells. One of their key advantages is that they largely spare healthy cells while doing so. This selectivity is often attributed to differences in the surface properties of cancer cells compared to normal cells, which is why ACPs are being investigated as potential candidates for cancer therapy. ACPs may be used alone or in combination with other treatment modalities such as chemotherapy and radiation therapy. While ACPs hold promise as a novel approach to cancer treatment, several challenges remain: optimizing their stability, improving selectivity, enhancing their delivery to cancer cells, keeping pace with the continuously increasing number of peptide sequences, and developing a reliable and precise prediction model. In this work, we propose an efficient transformer-based framework that identifies ACPs with an accurate, reliable, and precise prediction model. For this purpose, four different transformer models, namely ESM, ProtBERT, BioBERT, and SciBERT, are employed to detect ACPs from amino acid sequences. To demonstrate the contribution of the proposed framework, extensive experiments are carried out on datasets widely used in the literature: two versions of AntiCp2, cACP-DeepGram, and ACP-740. Experimental results show that the proposed model improves classification accuracy compared to previous studies. The proposed ESM-based framework achieves 96.45% accuracy on the AntiCp2 dataset, 97.66% on the cACP-DeepGram dataset, and 88.51% on the ACP-740 dataset, thereby setting a new state of the art.
The code of the proposed framework is publicly available on GitHub (https://github.com/mstf-yalcin/acp-esm). Copyright © 2024 Elsevier B.V. All rights reserved.
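
To make the classification setting in the abstract concrete, the following is a minimal sketch of two steps that surround any sequence classifier of this kind: preparing peptide strings as model input and computing accuracy from predicted labels. It is not taken from the ACP-ESM repository. The function names, the choice to map every non-standard residue to "X", and the example peptides are assumptions made for illustration. Space-separated residues are the input style commonly used with BERT-style protein models such as ProtBERT; other models may expect the raw sequence instead.

```python
# Illustrative sketch only: all names here are hypothetical and are not
# part of the ACP-ESM code base.

STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_peptide(sequence: str) -> str:
    """Uppercase a peptide and replace non-standard residues with 'X'.

    Mapping every unknown residue to 'X' is an assumption of this sketch.
    """
    return "".join(
        residue if residue in STANDARD_AMINO_ACIDS else "X"
        for residue in sequence.strip().upper()
    )


def space_separate(sequence: str) -> str:
    """Insert a space between residues, as BERT-style protein models
    are commonly fed (one token per amino acid)."""
    return " ".join(clean_peptide(sequence))


def accuracy(true_labels: list[int], predicted_labels: list[int]) -> float:
    """Fraction of peptides whose predicted label matches the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


if __name__ == "__main__":
    # Arbitrary example strings, not real dataset entries.
    peptides = ["flpiiakllsgll", "GLFDIVKKVVGALG"]
    print([space_separate(p) for p in peptides])
    # Labels: 1 = ACP, 0 = non-ACP (a labeling convention assumed here).
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

In an actual pipeline, the prepared strings would be passed to a tokenizer and a fine-tuned transformer, and the accuracy function would be applied to the model's predictions on a held-out test split.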