1 Data source:
NCBI accession: SRP133674
Article: Shifting the limits in wheat research and breeding using a fully annotated reference genome
Cytosine methylation was profiled in DNA extracted from two-week-old CS leaf tissue in three different contexts: CpG dinucleotides, CHG and CHH (where H corresponds to A, T or C). The frozen leaves from the five samples at 3-leaf stage (Zadok stage 13) were ground and divided as input for the preparation of both RNA-seq libraries (detailed in Chinese Spring tissues study) and whole genome bisulfite sequencing (WGBS) libraries.
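The three contexts can be told apart from the two bases that follow a cytosine on the same strand. The following is a minimal sketch of that rule, not the code used in the paper; the function name `cytosine_context` is hypothetical, and cytosines followed by an ambiguous base (such as N) or lying too close to the end of the sequence are simply left unclassified.

```python
def cytosine_context(seq, i):
    """Classify the cytosine at 0-based index i of seq as 'CG', 'CHG' or 'CHH'.

    seq is read 5'->3' on the strand carrying the cytosine.
    Returns None if seq[i] is not a C or the context cannot be determined.
    """
    seq = seq.upper()
    h_bases = "ACT"  # H = A, C or T (anything but G)
    if seq[i] != "C" or i + 1 >= len(seq):
        return None
    first = seq[i + 1]
    if first == "G":
        return "CG"
    if first not in h_bases or i + 2 >= len(seq):
        return None
    second = seq[i + 2]
    if second == "G":
        return "CHG"
    if second in h_bases:
        return "CHH"
    return None
```

For a cytosine on the reverse strand, the same function can be applied to the reverse complement of the reference sequence.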
2 Description of results
Average cytosine methylation frequencies in wheat were 92.7% in the CpG context, 51.3% in CHG and 2.7% in CHH. The observed levels of cytosine methylation are among the highest observed in angiosperms (161), likely reflecting the abundance of repetitive elements throughout the wheat genome. Methylation patterns in wheat largely follow those observed in other species, showing enrichment in CpG and CHG sequence contexts at pericentromeric regions (gene poor) and depletion toward the chromosome ends (gene rich).
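Per-context averages of this kind can be recomputed from a methylation call file. The sketch below assumes tab-separated CGmap-style lines with eight columns (chromosome, nucleotide, position, context, dinucleotide, methylation level, methylated read count, total read count) and pools read counts within each context. The paper does not say whether its averages are pooled or per-site means, so pooling is an assumption here, as are the function name and the `min_cov` parameter.

```python
from collections import defaultdict


def context_methylation(lines, min_cov=1):
    """Return {context: pooled methylation level} from CGmap-style lines."""
    meth = defaultdict(int)   # methylated read counts per context
    total = defaultdict(int)  # total read counts per context
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # skip malformed or empty lines
        context = fields[3]
        m_count, coverage = int(fields[6]), int(fields[7])
        if context not in ("CG", "CHG", "CHH") or coverage < min_cov:
            continue
        meth[context] += m_count
        total[context] += coverage
    return {c: meth[c] / total[c] for c in total if total[c] > 0}
```

Passing an open file handle as `lines` works as well as a list, since the function only iterates over it once.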
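First, look at the methylation pattern of high confidence (HC) genes. As the figure below shows, CpG and CHG methylation is relatively low in the gene coding region and relatively high in the upstream promoter and downstream regions, whereas CHH methylation stays relatively flat. When you analyze your own genes, you can check whether they follow this pattern.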
high confidence genes
(TSS = transcription start site; TTS = transcription termination site)
High rates of DNA methylation likely serve to prevent transposition by restricting the expression of transposable elements. However, where repetitive elements are proximal to gene sequences, the enriched methylation can perform a regulatory function, predominantly silencing expression. The distinct and highly conserved methylation patterns observed in regions of HC genes and their regulatory regions showed higher levels of DNA methylation associated with the 5’ regulatory regions in all contexts that diminished rapidly at the transcriptional start site (TSS).
What about the methylation pattern of low confidence (LC) genes? As the figure below shows, all three contexts are relatively flat.
(TSS = transcription start site; TTS = transcription termination site)
DNA methylation increased in the gene body where the CpG methylation formed a peak, whereas gene body methylation levels remained at extremely low levels at CHG and CHH sites. In the 3’ regulatory region after the transcriptional termination site (TTS) methylation rapidly reverted to the levels in 5’ sequences. This contrasted with the pattern observed for LC genes, where a near uniform level of methylation was observed in all sequence contexts. As a conclusion, many of the features included in the LC annotation are either not genes, are truncated or have lost their function through mutation (i.e. pseudogenes).
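To check whether a set of genes follows the HC or the LC pattern, a profile around the TSS and TTS can be approximated by binning cytosines relative to each gene. The following is a rough sketch and not the method used in the paper: the flank size, the number of bins and both function names are hypothetical choices, positions are 1-based, and the nested loop is only practical for small gene sets.

```python
def metagene_bin(pos, start, end, strand, flank=2000, n_bins=10):
    """Map a position to a bin: upstream [0, n), body [n, 2n), downstream [2n, 3n)."""
    length = end - start + 1
    # Offset along the direction of transcription; 0 is the TSS.
    offset = pos - start if strand == "+" else end - pos
    if -flank <= offset < 0:
        return (offset + flank) * n_bins // flank
    if 0 <= offset < length:
        return n_bins + offset * n_bins // length
    if length <= offset < length + flank:
        return 2 * n_bins + (offset - length) * n_bins // flank
    return None  # outside the gene and its flanks


def metagene_profile(sites, genes, flank=2000, n_bins=10):
    """Pool read counts per bin.

    sites: iterable of (chrom, pos, methylated_reads, total_reads)
    genes: list of (chrom, start, end, strand)
    Returns a list of 3 * n_bins methylation levels (None for empty bins).
    """
    meth = [0] * (3 * n_bins)
    total = [0] * (3 * n_bins)
    for chrom, pos, m_count, coverage in sites:
        for g_chrom, start, end, strand in genes:
            if chrom != g_chrom:
                continue
            b = metagene_bin(pos, start, end, strand, flank, n_bins)
            if b is not None:
                meth[b] += m_count
                total[b] += coverage
    return [m / t if t else None for m, t in zip(meth, total)]
```

Running this separately for CpG, CHG and CHH sites gives one curve per context, which can then be compared with the HC and LC patterns described above.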

3 Methylation analysis
Prof. Weilong Guo of China Agricultural University developed the bisulfite read mapping software BS-Seeker2 (BS-Seeker2: a versatile aligning pipeline for bisulfite sequencing data) and the downstream methylation analysis software CGmapTools (CGmapTools improves the precision of heterozygous SNV calls and supports allele-specific methylation detection and visualization in bisulfite-sequencing data).
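2. When using bs_seeker2-call_methylation.py, do not call methylation on the whole genome at once. First, it is too slow; second, a whole-genome run triggers a bug (I do not know whether other people have seen it). Briefly, this is how I tested it: I ran call methylation on the whole genome and stopped the run as soon as the program reported that 1A had finished; I extracted the 1A reads into a separate BAM file and called methylation on 1A alone; and I merged 1A and 2A and called methylation on the two together. The whole-genome result differed from the other two, whereas 1A alone and 1A plus 2A gave identical results.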
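The per-chromosome approach described above can be scripted. The sketch below only builds the command lines and does not run them, so they can be inspected first. It makes several assumptions: the BAM file is coordinate-sorted and indexed (wheat chromosomes can exceed the size limit of a BAI index, so a CSI index from `samtools index -c` may be needed), chromosome names match those in the BAM header, and the `-i`, `-o` and `--db` options of bs_seeker2-call_methylation.py should be checked against the `-h` output of your installed version. The function name and output layout are hypothetical.

```python
import shlex


def per_chromosome_commands(bam, chromosomes, db_dir, out_dir="."):
    """Return a list of argument lists: one BAM split and one call per chromosome."""
    commands = []
    for chrom in chromosomes:
        sub_bam = f"{out_dir}/{chrom}.bam"
        # Extract the reads of one chromosome (needs a sorted, indexed BAM).
        commands.append(["samtools", "view", "-b", "-o", sub_bam, bam, chrom])
        # Call methylation on that chromosome only (flags are assumptions).
        commands.append([
            "bs_seeker2-call_methylation.py",
            "-i", sub_bam,
            "-o", f"{out_dir}/{chrom}",
            "--db", db_dir,
        ])
    return commands


if __name__ == "__main__":
    # Placeholder file names; print the commands for review instead of running them.
    for cmd in per_chromosome_commands("all.bam", ["1A", "2A"], "/path/to/bs2_db"):
        print(shlex.join(cmd))
```

Once the printed commands look right, each argument list can be passed to `subprocess.run(cmd, check=True)`.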
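4 Display in JBrowse
Let's look at an example. In rice, the GS5 gene controls grain shape and grain weight. Its wheat homologs (TraesCS3A02G212900LC, TraesCS3B02G277100LC and TraesCS3D02G172900) have also been cloned by homology by several research groups. The 3B copy carries two large insertions that disrupt the gene structure. In terms of methylation, both inserted sequences are relatively highly methylated (see the figure below).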