Learning Common and Task-specific Radiomic Features via Graph Regularized NMF for the Joint Prediction of Multiple Clinical Indicators in Breast Cancer.

Publication date: 2023 Aug 07

Authors: Jian Guan, Ming Fan, Tieyong Zeng, Lihua Li

Source: IEEE Journal of Biomedical and Health Informatics

Abstract:

Assessing multiple clinical indicators through radiomic analysis of magnetic resonance imaging (MRI) benefits the diagnosis, prognosis and treatment of breast cancer patients. Many machine learning methods have been designed to jointly predict multiple indicators for more accurate assessment, but they use the original clinical labels directly, without considering the noisy and redundant information among them. To address this, we propose a multilabel learning method based on label space dimensionality reduction (LSDR), which learns common and task-specific features via graph regularized nonnegative matrix factorization (CTFGNMF) for the joint prediction of multiple indicators in breast cancer. Nonnegative matrix factorization (NMF) is adopted to map the original clinical labels to a low-dimensional latent space. The latent labels are used to exploit task correlations through a least-squares loss function with l2,1-norm regularization, which identifies common features that help improve the generalization performance of correlated tasks. Furthermore, task-specific features are retained by a multitask regression formulation to increase the discriminative power for different tasks. Common and task-specific features are incorporated into a unified model by dynamic graph Laplacian regularization to learn complementary features. A multilabel classification model is then built to predict multiple clinical indicators, including human epidermal growth factor receptor 2 (HER2), Ki-67, and histological grade. Experimental results show that CTFGNMF achieves AUCs of 0.823, 0.691 and 0.776 in the three indicator predictions, outperforming counterparts that consider only task-independent features or only common features. This indicates that CTFGNMF is a promising approach for multiple classification tasks in breast cancer.
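
To make two of the building blocks named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It factors a toy binary label matrix with plain NMF (standard multiplicative updates for the Frobenius loss) to obtain low-dimensional latent labels, and it computes the l2,1 norm of a weight matrix, which is the quantity the regularizer penalizes to encourage whole feature rows to vanish across tasks. The function names `nmf_labels` and `l21_norm`, the latent rank of 2, the iteration count, and the random toy labels are all assumptions for illustration; the graph Laplacian term and the joint optimization of CTFGNMF are not shown.

```python
import numpy as np


def nmf_labels(Y, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative label matrix Y (samples x labels) as V @ B.

    V (samples x rank) plays the role of latent labels; B (rank x labels)
    maps them back to the original label space. Uses the classic
    multiplicative updates for the squared Frobenius reconstruction error.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_labels = Y.shape
    V = rng.random((n_samples, rank)) + eps
    B = rng.random((rank, n_labels)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative by construction.
        B *= (V.T @ Y) / (V.T @ V @ B + eps)
        V *= (Y @ B.T) / (V @ B @ B.T + eps)
    return V, B


def l21_norm(W):
    """Sum over rows of W of each row's Euclidean norm."""
    return float(np.sqrt((W ** 2).sum(axis=1)).sum())


# Toy stand-in for three binary clinical labels on 20 patients (assumed data).
toy_rng = np.random.default_rng(0)
Y = toy_rng.integers(0, 2, size=(20, 3)).astype(float)

V, B = nmf_labels(Y, rank=2)
reconstruction_error = np.linalg.norm(Y - V @ B)
```

In a pipeline of the kind the abstract describes, the latent labels `V` would serve as regression targets for the radiomic features, with an l2,1 penalty on the regression weights selecting features shared across tasks. That wiring is left out here because the abstract does not specify it.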