Chinese Abstract
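Building on previously acquired baseline metabolomics data for esophageal precancerous lesions, this project will continue follow-up to collect dynamic metabolic profiles over the progression from dysplasia to esophageal cancer, establishing a dynamic metabolomics study of esophageal cancer with a longitudinal cohort design. The analytical challenges are how to handle interference from time-dependent confounders, how to model and select variables in high-dimensional longitudinal data, and how to exploit the advantages of dynamic data to infer causal relationships among metabolites; effective analytical methods are still lacking. This project builds a marginal structural model based on inverse probability weighting (IPW) to correct for the influence of time-dependent confounders on causal effect estimation. For high-dimensional longitudinal data, it constructs an ensemble model, the random marginal structural forest, by defining an IPW tree-splitting criterion, a two-level resampling strategy, and evaluation indices for the model and for variable importance. It further establishes cross-lagged path analysis for inferring causal relationships among metabolites. The expected outcome is a set of variable selection and causal inference methods suited to high-dimensional dynamic metabolomics data, together with the identification of metabolomic markers and metabolic pathways that change dynamically during the esophageal precancerous lesion stage. The project provides new research ideas and methods for the statistical analysis of high-dimensional dynamic metabolomics data, and is of practical importance for early intervention and prevention in populations at high risk of esophageal cancer.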
English Abstract
Building on well-prepared baseline metabolomics data on esophageal precancerous lesions, this project will follow up the baseline samples to obtain repeated measurements of the dynamic metabolic spectrum during progression from dysplasia to esophageal squamous cell carcinoma (ESCC), thereby establishing a longitudinal cohort-based dynamic metabolomics study of ESCC. The challenges in analyzing high-dimensional dynamic metabolomics data are how to adjust for time-dependent confounders, how to model high-dimensional data and select variables, and how to infer causal relationships in metabolic pathways by taking advantage of dynamic data. Effective statistical methods for these problems are still lacking. In this study, inverse probability weighting (IPW) is used to construct a marginal structural model that adjusts for time-dependent confounders. A random marginal structural forest (RMSF) is then proposed for high-dimensional longitudinal data analysis, built by defining an IPW tree-splitting criterion, a two-level bootstrap resampling strategy, and evaluation indices for the model and for variable importance. Cross-lagged path analysis is further proposed for causal inference on metabolic pathways. The objectives are to build powerful methods for variable selection and causal inference in high-dimensional dynamic metabolomics, and then to discover and identify potential dynamic metabolic biomarkers and pathways for the early diagnosis and progression of ESCC. This project will provide new ideas for metabolomics study design and statistical methods, and has practical significance for the early intervention and prevention of ESCC in high-risk populations.
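
As a rough illustration of the IPW step named in the abstract, the following minimal Python sketch computes stabilized inverse probability weights for a single subject with a binary time-varying exposure. The function name `stabilized_weights`, the binary coding of the exposure, and the premise that the exposure probabilities have already been estimated (for example by logistic regression) are illustrative assumptions; the abstract does not specify how the project constructs its weights.

```python
def stabilized_weights(exposure, p_num, p_den):
    """Cumulative stabilized IPW weights for one subject (hypothetical helper).

    exposure : list of 0/1 exposure values, one per visit
    p_num    : estimated P(A_t = 1 | exposure history) per visit
    p_den    : estimated P(A_t = 1 | exposure history, time-dependent
               confounder history) per visit
    Returns a list whose entry t is the product over visits 0..t of
    (numerator probability of the observed exposure) /
    (denominator probability of the observed exposure).
    """
    if not (len(exposure) == len(p_num) == len(p_den)):
        raise ValueError("all inputs must have one entry per visit")

    weights = []
    running = 1.0
    for a, pn, pd in zip(exposure, p_num, p_den):
        # Probability of the exposure value that was actually observed.
        num = pn if a == 1 else 1.0 - pn
        den = pd if a == 1 else 1.0 - pd
        if den <= 0.0:
            raise ValueError("denominator probability must be positive")
        running *= num / den
        weights.append(running)
    return weights


if __name__ == "__main__":
    # Made-up probabilities for two visits, purely for illustration.
    print(stabilized_weights([1, 0], [0.5, 0.5], [0.8, 0.4]))
```

In a marginal structural model, weights of this kind would be attached to each subject-visit record before fitting the outcome model, so that the weighted sample behaves as if exposure were unrelated to the measured time-dependent confounders. How such weights would enter the tree-splitting criterion of the proposed forest is not described in the abstract and is not attempted here.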
