Algorithms for Identifying and Eliminating DNA Contamination in RNA-seq Data

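  • Grant number: 31671368
  • Year approved: 2016
  • Discipline: Bioinformatics algorithms and tools (C060702)
  • Principal investigator: Leming Shi (石乐明)
  • PI title: Professor
  • Host institution: Fudan University (复旦大学)
  • Funding: CNY 600,000
  • Project type: General Program (面上项目)
  • Duration: 1 January 2017 to 31 December 2020
  • Chinese keywords (translated): RNA-seq; DNA; contamination; removal; algorithm
  • English keywords: Human; Data mining; Statistical analysis; RNA-seq; Accuracy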

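Project abstract

Chinese abstract (translated)

Residual DNA is difficult to avoid during RNA extraction and can be enriched during sequencing, which undermines the reliability and clinical application of RNA-seq results. The applicant's preliminary work found that DNA contamination is widespread in RNA-seq data and shows a strong preference for particular genomic regions; however, it has not yet been studied systematically, and no computational method exists to remove its effects. This project will design a series of RNA reference samples contaminated with DNA at known concentrations and generate the corresponding RNA-seq data, in order to: 1. establish a calibration model for estimating the degree of DNA contamination from RNA-seq data; 2. identify and quantify, genome-wide, the distribution patterns and sequence-preference characteristics of DNA contamination; 3. develop a computational method for removing DNA contamination, and evaluate and optimize it using reference data sets such as SEQC; 4. reanalyze data sets such as ENCODE with the new algorithm, compare the results with the original analyses, and validate them experimentally to assess the algorithm's effectiveness. The project will systematically clarify how DNA contamination affects the accuracy of RNA-seq results; the resulting algorithm is expected to be widely applicable to both future and existing RNA-seq data and to improve the reliability of analysis results.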

English abstract

Residual genomic DNA (gDNA) is commonly carried over during RNA isolation and can be enriched by some RNA-seq protocols, compromising the reliability of RNA-seq results. We observed that gDNA contamination in RNA-seq data is ubiquitous and shows sequence biases across the whole genome. However, these biases have not been studied systematically, and no effective experimental or computational method is available to characterize and remove the negative impact of gDNA contamination. In this proposed study, RNA-seq data will be generated from standard RNA samples containing gDNA at known concentrations in order to: 1. establish calibration models for estimating the level of gDNA contamination from RNA-seq data; 2. characterize and quantify the profiles and sequence biases of gDNA contamination at the genome scale; 3. develop a data analysis pipeline for minimizing the negative impact of gDNA contamination in RNA-seq data; 4. optimize and assess the performance of the pipeline using the large RNA-seq data sets from the SEQC project, and apply it to well-known data sets such as ENCODE to evaluate the degree of gDNA contamination and to validate a subset of the analysis results. The proposed method may be suitable for analyzing both historical and future RNA-seq data and is expected to significantly improve the reliability of RNA-seq measurements.
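
Aim 1 (a calibration model built from samples with known gDNA concentrations) can be illustrated with a minimal sketch. The abstract does not say which signal the project calibrates against or which model form it uses, so everything here is an assumption: the sketch takes the fraction of reads mapping to intergenic regions as a hypothetical contamination proxy, fits a straight line by ordinary least squares, and inverts that line for a new sample. The function names and the numbers are illustrative only; the values are synthetic, not measured data.

```python
from statistics import mean


def fit_calibration(gdna_pct, intergenic_frac):
    """Least-squares line: intergenic_frac = slope * gdna_pct + intercept."""
    x_bar, y_bar = mean(gdna_pct), mean(intergenic_frac)
    sxx = sum((x - x_bar) ** 2 for x in gdna_pct)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(gdna_pct, intergenic_frac))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def estimate_gdna(frac, slope, intercept):
    """Invert the calibration line; clamp negative estimates to zero."""
    return max(0.0, (frac - intercept) / slope)


# Synthetic, exactly linear toy values (assumption, NOT measured data):
# spiked gDNA percentage vs. hypothetical intergenic read fraction.
spiked_gdna_pct = [0.0, 1.0, 2.0, 5.0, 10.0]
observed_intergenic = [0.02, 0.03, 0.04, 0.07, 0.12]

slope, intercept = fit_calibration(spiked_gdna_pct, observed_intergenic)

# Estimate contamination for a new sample with 5% intergenic reads.
estimate = estimate_gdna(0.05, slope, intercept)
print(f"slope={slope:.4f} intercept={intercept:.4f} estimate={estimate:.2f}%")
```

A real calibration would likely need to account for the region and sequence biases named in aim 2, so a single global proxy and a straight line should be read only as the simplest possible starting point.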

