Research Updates
Articles below are published ahead of final publication in an issue. Please cite articles in the following format: authors, (year), title, journal, DOI.

Optimal subsampling for semi-parametric accelerated failure time models with massive survival data using a rank-based approach.

Published: 2024 Aug 20
Authors: Zehan Yang, HaiYing Wang, Jun Yan
Source: STATISTICS IN MEDICINE

Abstract:

Subsampling is a practical strategy for analyzing massive survival data, which are increasingly encountered across diverse research domains. While the optimal subsampling method has been applied to inference for Cox models and parametric accelerated failure time (AFT) models, its application to semi-parametric AFT models with rank-based estimation has received limited attention. The challenges arise from the non-smooth estimating function for the regression coefficients and from the seemingly zero contribution of censored observations to the estimating function in its commonly seen form. To address these challenges, we develop optimal subsampling probabilities for both event and censored observations by expressing the estimating functions through a well-defined stochastic process. Meanwhile, we apply an induced smoothing procedure to the non-smooth estimating functions. As the optimal subsampling probabilities depend on the unknown regression coefficients, we employ a two-step procedure to obtain a feasible estimation method. An additional benefit of the method is its ability to resolve the issue of underestimation of the variance when the subsample size approaches the full sample size. We validate the performance of our estimators through a simulation study and apply the methods to analyze the survival time of lymphoma patients in the Surveillance, Epidemiology, and End Results (SEER) program. © 2024 John Wiley & Sons Ltd.
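
To make the two-step idea concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a single scalar covariate, the Gehan-weighted rank estimating function with normal-CDF induced smoothing, a uniform pilot subsample, and inverse-probability weights in the second step. The second-step probabilities come from a heuristic score chosen purely for illustration; the abstract does not state the paper's optimal probabilities, so they are not reproduced here. All function names (`simulate`, `smoothed_gehan`, `solve`, `scores`, `two_step`) and all tuning values are hypothetical.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF used for induced smoothing


def simulate(n, beta, rng):
    """Toy AFT data: log T = beta * x + error, with independent censoring."""
    x, y, delta = [], [], []
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)
        log_t = beta * xi + rng.gauss(0.0, 0.5)
        log_c = rng.gauss(1.0, 1.0)
        x.append(xi)
        y.append(min(log_t, log_c))              # observed log time
        delta.append(1 if log_t <= log_c else 0)  # 1 = event, 0 = censored
    return x, y, delta


def smoothed_gehan(beta, x, y, delta, w):
    """Weighted, induced-smoothed Gehan estimating function (scalar covariate)."""
    m = len(x)
    e = [y[i] - beta * x[i] for i in range(m)]  # residuals on the log scale
    total = 0.0
    for i in range(m):
        if delta[i] == 0:
            continue  # a censored i enters only through the j index
        for j in range(m):
            d = x[i] - x[j]
            if d == 0.0:
                continue
            r = abs(d) / math.sqrt(m)  # bandwidth, identity working covariance
            total += w[i] * w[j] * d * PHI((e[j] - e[i]) / r)
    return total


def solve(x, y, delta, w, lo=-10.0, hi=10.0, iters=40):
    """Bisection root search; the smoothed function is nondecreasing in beta."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if smoothed_gehan(mid, x, y, delta, w) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def scores(beta, x, y, delta):
    """ASSUMPTION: heuristic score, not the paper's optimal probabilities.

    Each smoothed pairwise term is credited to both members of the pair,
    so censored observations also receive a nonzero score.
    """
    n = len(x)
    e = [y[i] - beta * x[i] for i in range(n)]
    h = [0.0] * n
    for i in range(n):
        if delta[i] == 0:
            continue
        for j in range(n):
            d = x[i] - x[j]
            if d == 0.0:
                continue
            t = d * PHI((e[j] - e[i]) / (abs(d) / math.sqrt(n)))
            h[i] += t
            h[j] += t
    return [abs(v) for v in h]


def two_step(x, y, delta, r0, r, rng):
    n = len(x)
    # Step 1: uniform pilot subsample gives a rough estimate of beta.
    pilot = [rng.randrange(n) for _ in range(r0)]
    beta0 = solve([x[i] for i in pilot], [y[i] for i in pilot],
                  [delta[i] for i in pilot], [1.0] * r0)
    # Step 2: score-based probabilities, mixed with uniform to keep them positive.
    s = scores(beta0, x, y, delta)
    tot = sum(s)
    pi = [0.8 * v / tot + 0.2 / n for v in s]
    idx = rng.choices(range(n), weights=pi, k=r)
    w = [1.0 / pi[i] for i in idx]  # inverse-probability weights
    beta = solve([x[i] for i in idx], [y[i] for i in idx],
                 [delta[i] for i in idx], w)
    return beta0, pi, beta


rng = random.Random(0)
x, y, delta = simulate(300, 1.0, rng)
beta_pilot, pi, beta_hat = two_step(x, y, delta, r0=50, r=100, rng=rng)
print(round(beta_pilot, 2), round(beta_hat, 2))
```

Bisection is used here only because the smoothed function is monotone in a scalar coefficient; with several covariates a general root finder would be needed, and the score in `scores` would be replaced by whatever optimality criterion the analysis calls for.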