Chinese Abstract
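Prediction models are widely used to predict prognostic outcomes or disease status and to explore the factors that influence events of interest, and they play an important role in cohort studies and longitudinal data analysis. However, when the outcome variable of interest is time-dependent and the main factors affecting its state also change as the study progresses, existing statistical methods show considerable bias in both parameter estimation and prediction. This situation often arises in longitudinal studies of chronic progressive conditions such as birth defects, pervasive developmental disorders, tumors, and mental illness, and in longitudinal studies of quality of life. To address this problem, the applicant will take the linear model as a basis, use machine learning algorithms and Bayesian algorithms to solve the parameter estimation problem, and make full use of the most recently collected information in a longitudinal study together with historical follow-up data, in order to construct a linear mixed prediction model for time-dependent variables based on complex longitudinal data. The statistical performance of the proposed model will be evaluated through Monte Carlo simulation and empirical studies based on real longitudinal data. The proposed model will provide a new, efficient, and sound statistical tool for predicting and analyzing time-dependent events in longitudinal data, and methodological support for longitudinal data analysis.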
English Abstract
Prediction models have been widely used to predict prognostic outcomes or disease status and in exploratory studies of potential influential factors. They are of great importance in cohort studies and the analysis of longitudinal data. However, when the outcome variable of interest is time-dependent and the main influential factors also change over time, existing methods may yield non-negligible bias in parameter estimation and outcome prediction. This situation occurs in studies of chronic progressive diseases such as birth defects, pervasive developmental disorders, cancer, and mental illness, and in studies of health-related quality of life. To solve this problem, we will propose a novel linear mixed prediction model for time-dependent outcome variables in complex longitudinal data, built on the linear model and using machine learning and Bayesian algorithms. The model will make full use of both the latest follow-up data and historical data. Monte Carlo simulation and a series of empirical studies based on real longitudinal data will be conducted to evaluate the statistical performance of the proposed model. The new model will provide a novel, reasonable, and efficient statistical tool for predicting time-dependent events in cohort studies and a methodological reference for the analysis of longitudinal data.
