Research Updates

Building a large gene expression-cancer knowledge base with limited human annotations.

Published: 2023 Sep 27
Authors: Stefano Marchesin, Laura Menotti, Fabio Giachelle, Gianmaria Silvello, Omar Alonso
Source: Database-Oxford

Abstract:

Cancer prevention is one of the most pressing challenges facing public health. Data-driven research is central to developing medical solutions that target cancer. To fully harness the power of data-driven research, it is imperative to organize machine-readable facts into a knowledge base (KB). Motivated by this urgent need, we introduce the Collaborative Oriented Relation Extraction (CORE) system for building KBs with limited manual annotations. CORE combines the distant supervision and active learning paradigms and offers a seamless, transparent, modular architecture equipped for large-scale processing. We focus on precision medicine and build the largest KB on 'fine-grained' gene expression-cancer associations, a key resource for complementing and validating experimental data in cancer research. We show the robustness of CORE and discuss the usefulness of the provided KB.

Database URL: https://zenodo.org/record/7577127

© The Author(s) 2023. Published by Oxford University Press.
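
The abstract names two paradigms, distant supervision and active learning, without describing how CORE combines them. The following is only a minimal sketch of the general idea and not the CORE implementation: sentences whose (gene, cancer) pair matches a seed fact inherit that fact's label, and the remaining sentences are ranked by uncertainty so that a limited annotation budget goes to the least certain ones. Every name here (SEED_KB, CUE_WEIGHTS, Sentence, the placeholder genes and cancers, and the cue-word scorer standing in for a trained classifier) is a hypothetical assumption made for illustration.

```python
from dataclasses import dataclass

# Hypothetical seed facts with placeholder names: (gene, cancer) -> expression label.
SEED_KB = {("GENE_A", "cancer_x"): "up"}

# Hypothetical cue words; a toy stand-in for a trained relation classifier.
CUE_WEIGHTS = {
    "overexpressed": 1.0,
    "upregulated": 1.0,
    "downregulated": -1.0,
    "silenced": -1.0,
}


@dataclass
class Sentence:
    text: str
    gene: str
    cancer: str


def distant_label(sentence, kb):
    """Distant supervision: inherit the label of a matching KB fact, else None."""
    return kb.get((sentence.gene, sentence.cancer))


def score(sentence):
    """Sum the weights of cue words found in the text; 0 means no evidence."""
    text = sentence.text.lower()
    return sum(w for cue, w in CUE_WEIGHTS.items() if cue in text)


def select_for_annotation(sentences, kb, budget):
    """Active learning by uncertainty sampling over sentences the KB cannot label."""
    unlabeled = [s for s in sentences if distant_label(s, kb) is None]
    # Scores closest to 0 are the least certain; sorted() is stable on ties.
    return sorted(unlabeled, key=lambda s: abs(score(s)))[:budget]


SENTENCES = [
    Sentence("GENE_A is upregulated in cancer_x.", "GENE_A", "cancer_x"),
    Sentence("GENE_B is downregulated in cancer_y.", "GENE_B", "cancer_y"),
    Sentence("GENE_C was studied in cancer_z.", "GENE_C", "cancer_z"),
    Sentence("GENE_D is overexpressed in cancer_w.", "GENE_D", "cancer_w"),
]

# With a budget of one, the sentence without any cue word is sent to annotators.
picked = select_for_annotation(SENTENCES, SEED_KB, budget=1)
```

In this toy setup the first sentence is labeled automatically from the seed fact, while the GENE_C sentence, which carries no cue word, is the one selected for manual annotation. A real system would replace the cue-word scorer with a learned model and retrain it after each annotation round.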