Chinese Abstract
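As an epigenetic mark, CpG site methylation regulates gene expression through its interaction with DNA-binding proteins such as transcription factors, and it is involved in many biological processes and diseases, including the initiation and progression of tumors. To date, no dedicated pan-cancer study has identified, on a genome-wide scale, the tumor-type-specific DNA methylation profiles and their regulatory functions. In particular, the process by which transcription factors regulate their target genes with the involvement of DNA methylation has not been systematically identified or studied. This project plans to use the large-scale multi-omics data from TCGA to comprehensively analyze and catalog the DNA methylation profiles of 22 cancer types from multiple perspectives, providing a set of data resources and analysis results for follow-up research. We will develop a dedicated integrative big-data analysis pipeline to systematically study how DNA methylation regulates gene expression in each tumor type, and we will construct DNA methylation-dependent transcriptional regulatory networks and validate them with high-throughput data. This series of cancer-type-specific networks will be the first systematic account of transcriptional regulation involving DNA methylation, and will be of substantial practical and research value for identifying drivers of aberrant gene expression, screening for functional methylation sites, and assisting the precise subtyping of tumors.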
English Abstract
As a major type of epigenetic mark, CpG site DNA methylation has been shown to play critical roles in pluripotency, development, and various diseases. Aberrant DNA methylation patterns, such as hyper- or hypo-methylation of certain CpG sites or regions relative to normal cells, have been associated with tumorigenesis in many types of cancer. The function of DNA methylation is believed to be mediated primarily by its interplay with transcription factors (TFs) through diverse and complex mechanisms. Over decades of research on DNA methylation, most studies have focused on the regulation of individual genes by methylation of specific CpG sites or regions. Here we propose to obtain a comprehensive view of the DNA methylation architecture in cancers and to dissect its regulatory function on specific genes via interactive circuits, on a genome-wide scale and in a pan-cancer manner. The Cancer Genome Atlas (TCGA) project has now generated large collections of multidimensional molecular profiles of tumor samples from almost all major cancers, with large dynamic ranges and independent sampling, and thus serves as a valuable resource for mining associative and regulatory interactions at various levels. In this proposed study, we plan to take advantage of the multi-omics profiling data of tumors from the TCGA consortium to systematically interrogate promoter DNA methylation architectures and their regulatory functions. We will perform a range of analyses on the methylomes of 22 cancer types to reveal the similarities and distinctive features among cancer types, and to explore the power of DNA methylation patterns in defining cancer (sub-)types.
To dissect the regulatory effects of promoter CpG site methylation on gene expression, we will develop an integrative analysis pipeline based on conditional mutual information to quantify, on a genome-wide scale, the cooperative regulatory effects of CpG site methylation and transcription factor activity on gene expression. The results will be assembled into a DNA Methylation-dependent Transcription Regulatory Network (MeTRN) for each of the 22 major cancer types in TCGA. These transcriptional regulatory circuits will be cross-validated, mostly in a context-specific manner, against known TF binding motifs and high-throughput ChIP-seq and DNaseI-seq datasets. In sum, the integrative analysis strategy will provide a systematic view of gene expression regulation driven by the interplay between TFs and DNA methylation. These context-dependent transcriptional regulation patterns will be used to differentiate and re-cluster cancer (sub-)types, which would indicate differences and similarities in the mechanisms of tumorigenesis, thereby laying the basis for a better understanding of cancer development. Such a systematic view of the methylation-dependent transcriptional regulatory circuitry would also provide a framework for dissecting how much of the driving force behind gene expression changes is attributable to dysregulation of DNA methylation status.
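As an illustration only, the conditional mutual information idea can be sketched as follows. This is not the proposed pipeline. It is a minimal plug-in estimator under stated assumptions: expression and methylation values are discretized into equal-frequency bins, methylation is split into two states (low and high), and the function names, bin counts, and synthetic data are all hypothetical choices made for this example. It computes I(TF; gene | methylation) as the sample-weighted average of the TF-gene mutual information within each methylation state.

```python
import numpy as np

def _equal_frequency_bins(values, bins):
    """Map each value to a rank-based bin index in 0..bins-1."""
    ranks = np.argsort(np.argsort(values))
    return ranks * bins // len(values)

def mutual_information(x, y, bins=4):
    """Plug-in mutual information estimate (in nats) of two 1-D samples."""
    xb = _equal_frequency_bins(x, bins)
    yb = _equal_frequency_bins(y, bins)
    joint = np.zeros((bins, bins))
    np.add.at(joint, (xb, yb), 1)           # contingency table of bin pairs
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = joint.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = joint > 0                          # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def conditional_mutual_information(tf, gene, meth, meth_states=2, bins=4):
    """Estimate I(tf; gene | meth) with meth discretized into meth_states.

    Returns the weighted average and the per-state MI values, so a large gap
    between states hints that methylation modulates the TF-gene relationship.
    """
    state = _equal_frequency_bins(meth, meth_states)
    per_state = []
    cmi = 0.0
    for s in range(meth_states):
        mask = state == s
        mi = mutual_information(tf[mask], gene[mask], bins)
        per_state.append(mi)
        cmi += mask.mean() * mi             # weight by fraction of samples
    return cmi, per_state

# Synthetic example (hypothetical data): the TF drives the gene only when
# the promoter CpG site is lowly methylated.
rng = np.random.default_rng(0)
n = 2000
tf_expr = rng.normal(size=n)
cpg_beta = rng.uniform(size=n)
gene_expr = np.where(cpg_beta < 0.5, tf_expr, 0.0) + 0.5 * rng.normal(size=n)

cmi, (mi_low, mi_high) = conditional_mutual_information(tf_expr, gene_expr, cpg_beta)
print(f"CMI={cmi:.3f}  MI(low meth)={mi_low:.3f}  MI(high meth)={mi_high:.3f}")
```

In this toy setting the TF-gene mutual information should be clearly higher in the low-methylation state than in the high-methylation state. A real analysis would also need a significance test (for example, by permutation) and a bias-aware estimator, neither of which is shown here.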
