
Population-Based Methods for Integrating Multi-Layer Biomedical Data and Research on Tumor Risk Prediction

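  • Approval number: 81530088
  • Approval year: 2015
  • Discipline: Epidemiological Methods and Health Statistics (H2611)
  • Principal investigator: Chen Feng (陈峰)
  • Title of principal investigator: Professor
  • Host institution: Nanjing Medical University
  • Funding amount: 2.74 million yuan
  • Project category: Key Program
  • Research period: January 1, 2016 to December 31, 2020
  • Chinese keywords: biomedicine; integration; tumor; risk; prediction
  • English keywords: big data; integrative analysis; information entropy; large-scale cohort; risk assessment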

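Project Abstract

Chinese Abstract

Biomedical data come from a wide range of sources and span multiple layers, including individual-level and population-level environmental exposure, genetic variation, DNA methylation, and gene expression. Conventional studies typically analyze a single complete dataset from one layer and overlook the relationships among layers. Following an analysis strategy of "preliminary screening → secondary screening → fine modeling → population validation" and drawing on big-data thinking, this project will integrate population-based multi-layer biomedical data to explore the complex factors associated with common cancers such as lung cancer and gastric cancer, and to build risk prediction models with improved predictive accuracy. Taking full account of prior biological information, such as the structure of the layers and the regulatory relationships among them, we will propose a weighted information entropy method to rapidly enrich genes that carry main effects or within-layer and cross-layer gene-gene and gene-environment interaction information; propose a Bayesian sequential analysis method that integrates data layer by layer to screen predictive factors more efficiently; improve the causal mediation analysis model to explore how, and how strongly, factors at multiple layers act; and apply the proposed methods to association analysis and risk prediction modeling for lung cancer and gastric cancer, with validation in large-scale population cohorts.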

English Abstract

Biomedical big data (BBD) are generated from a variety of sources and multiple layers, and include personal-level exposure data, population-level environmental exposure information, high-resolution medical images, electronic health records, and data from high-throughput genomic platforms such as DNA sequencing, DNA methylation, and gene expression. Most previous studies focused on a dataset from a single layer and ignored the associations among the multiple layers of BBD. In this study, we aim to develop more effective statistical methods for BBD integration in order to improve understanding of, and provide insights into, biomedical big data. The following strategy will be applied: a) preliminary fast screening of the risk factors; b) fine evaluation of the risk factors; c) building a risk prediction model; d) validation in independent populations. To further understand the complex associations between these factors and cancer risk, we will propose an entropy-based weighted information gain (WIG) method to efficiently enrich genes that carry main effects, interactions within a single layer, interactions among multiple layers, or interactions with the environment. A major advantage of the WIG method is that it incorporates prior biological information, such as molecular processes and regulatory relationships, into the subsequent analysis. We will then propose a Bayesian sequential method that integrates data from multiple layers to provide better prediction of cancer risk. Furthermore, we will use an improved causal mediation analysis to explore potential causal pathways. The proposed methods will be applied to lung cancer and gastric cancer. Risk factors and prediction models will also be explored and validated in large-scale cohorts.
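The abstract names an entropy-based weighted information gain but does not give its formula. As a rough illustration only, the sketch below computes the standard information gain of a discrete factor (for example, a genotype) with respect to case/control status, and then scales it by a prior weight. The multiplicative weighting, the interaction measure, and all function and parameter names (`weighted_information_gain`, `interaction_gain`, `prior_weight`) are assumptions made for this example and are not taken from the project.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature, outcome):
    """Standard information gain: H(outcome) - H(outcome | feature)."""
    n = len(outcome)
    groups = {}
    for x, y in zip(feature, outcome):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(ys) / n * entropy(ys) for ys in groups.values())
    return entropy(outcome) - conditional


def weighted_information_gain(feature, outcome, prior_weight=1.0):
    """Assumed weighting scheme: scale the gain by a prior biological weight."""
    return prior_weight * information_gain(feature, outcome)


def interaction_gain(a, b, outcome):
    """Gain from the pair (a, b) beyond the sum of the two main-effect gains."""
    joint = list(zip(a, b))
    return (information_gain(joint, outcome)
            - information_gain(a, outcome)
            - information_gain(b, outcome))


if __name__ == "__main__":
    # Toy data: the outcome depends only on the combination of a and b.
    a = [0, 0, 1, 1]
    b = [0, 1, 0, 1]
    status = [0, 1, 1, 0]
    print(information_gain(a, status))
    print(interaction_gain(a, b, status))
    print(weighted_information_gain(a, status, prior_weight=2.0))
```

In the toy data neither factor is informative on its own, yet the pair fully determines the outcome, which is the kind of interaction-only signal a screening step based on main effects alone would miss.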

