Chinese Abstract
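Analyzing high-dimensional medical omics data collected from multiple sources at multiple time points makes it possible to identify biomarkers and track how they change, enabling more accurate early diagnosis and prognosis prediction. Current predictive models are mostly fitted directly to cross-sectional data and usually fall short of the expected performance. This project therefore proposes novel deep neural network models and algorithms for data from different sources, such as DNA and RNA, collected at multiple time points. The main goal is to solve the hierarchical feature extraction problem for heterogeneous multi-time-point data and to use the extracted features for description and prediction. The main research questions are: an adaptive deep neural network architecture (a strategy for determining the number of hidden layers and nodes); unsupervised deep learning algorithms; feature extraction methods for each hidden layer; time-series feature extraction; optimization algorithms for the network parameters of each hidden layer; biomarker screening methods; and parallel computation for deep neural networks. As the practical application, the project takes postoperative ovarian cancer patients, a current focus of international research, as the study population. Monitoring data will be collected at 7 time points before and after surgery, including sequencing of circulating ctDNA in plasma (nucleotide mutations and methylation changes), protein, metabolic, and clinical information. The models and data analysis methods above will be used to build a model for predicting recurrence and progression, which will then be validated in a prospective study.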
English Abstract
Multi-modality data collected at multiple time points can support more accurate early diagnosis and prognosis prediction for various diseases. Current models use only cross-sectional information, which discards the relationships between time points and leads to unsatisfactory predictive performance. In this project, we therefore aim to analyze multi-modality data from multiple time points with a novel deep neural network, which establishes the predictive model from these data and helps illustrate the potential mechanism. The project addresses the following critical questions: establishing an adaptive deep neural network that automatically identifies the number of hidden layers and the number of nodes in each layer; developing unsupervised deep neural networks and parameter optimization algorithms for each layer, especially for time series data; biomarker selection and validation; and parallel computation for deep neural networks. To illustrate its advantage in clinical practice, we will apply this approach to monitoring recurrence in ovarian cancer patients based on multi-source data at different time points. We will obtain cfDNA, proteomics, and metabolomics data to establish predictive models for monitoring ovarian cancer.
