Development of machine learning models based on molecular fingerprints for selection of small molecule inhibitors against JAK2 protein.
Publication date: 2023 Mar 16
Authors:
Sharath Belenahalli Shekarappa, Shivananda Kandagalla, Julian Lee
Source:
Arthritis & Rheumatology
Abstract:
Janus kinase 2 (JAK2) is emerging as a potential therapeutic target for many inflammatory diseases, such as myeloproliferative disorders (MPD), cancer, and rheumatoid arthritis (RA). In this study, we collected experimental data for 6021 unique inhibitors of the JAK2 protein. We featurized the compounds with Morgan (ECFP6) fingerprints and then split them into training and test sets by clustering on their molecular scaffolds. These data were used to build classification models with various supervised machine learning (ML) algorithms, with the aim of prioritizing novel inhibitors for future drug development against JAK2. The best model, built with Random Forest (RF) and Morgan fingerprints, achieved a G-mean of 0.84 on the external test set. As an application of our classification model, we performed virtual screening of DrugBank molecules to identify potential inhibitors based on the confidence score of the RF model. Nine potential molecules were identified and were further subjected to molecular docking studies to evaluate the virtual screening results of the best RF model. The proposed method may prove useful for developing novel target-specific JAK2 inhibitors. © 2023 Wiley Periodicals LLC.
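Two steps named in the abstract, the scaffold-based train/test split and the G-mean metric, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that G-mean denotes the geometric mean of sensitivity and specificity (the usual definition for imbalanced binary classification, not stated in the abstract), and that each molecule already has a scaffold key, such as a precomputed Bemis-Murcko scaffold SMILES. The function names, the `test_fraction` parameter, and the largest-groups-first heuristic are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (assumed definition).

    Labels are binary: 1 = active (inhibitor), 0 = inactive.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def scaffold_split(scaffolds, test_fraction=0.2):
    """Assign whole scaffold groups to train or test.

    `scaffolds` holds one scaffold key per molecule (hypothetical input).
    No scaffold ends up in both sets. Returns (train_idx, test_idx).
    """
    groups = defaultdict(list)
    for idx, key in enumerate(scaffolds):
        groups[key].append(idx)
    # Heuristic (an assumption): place the largest groups in training first,
    # so that rarer scaffolds tend to land in the test set.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))
    train_cutoff = (1.0 - test_fraction) * len(scaffolds)
    train_idx, test_idx = [], []
    for group in ordered:
        if len(train_idx) + len(group) <= train_cutoff:
            train_idx.extend(group)
        else:
            test_idx.extend(group)
    return train_idx, test_idx


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    toy_scaffolds = ["A", "A", "A", "B", "B", "C", "D", "A", "B", "C"]
    train, test = scaffold_split(toy_scaffolds, test_fraction=0.2)
    print("train:", train, "test:", test)
    print("G-mean:", g_mean([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1]))
```

Splitting by scaffold rather than at random keeps structurally similar compounds on the same side of the split, so the test score better reflects performance on unseen chemotypes. Because G-mean multiplies sensitivity by specificity, a model that ignores the minority class scores near zero even when its overall accuracy is high.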