TEs as a supply of regulatory elements
1. Promoter
In genetics, a promoter is a sequence of DNA to which proteins bind that initiate transcription of a single RNA from the DNA downstream of it. Promoters can be about 100–1000 base pairs long.
Of the 2004 sequences analyzed, 475 (~24%) promoters contain TE-derived sequences, which make up ~8% of the total nucleotides in all of the promoters. The data analysis shows that regulatory element sequences contain many kinds of transposable elements.
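The ~24% figure above follows from a simple ratio. A minimal check in Python, using only the two counts given in the text (the helper name `percent` is purely illustrative):

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

total_promoters = 2004    # promoter sequences analyzed (from the text)
promoters_with_te = 475   # promoters containing TE-derived sequence (from the text)

te_promoter_pct = percent(promoters_with_te, total_promoters)
print(f"{te_promoter_pct:.1f}% of promoters contain TE-derived sequence")
```

The ~8% nucleotide figure cannot be recomputed this way because the underlying base counts are not given in these notes.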
To demonstrate unequivocally an effect of TEs on the regulation of host genes, it is necessary to show that experimentally characterized cis-regulatory elements that bind nuclear transcription factors have been derived from TE sequences. In other words, proving that TEs have a regulatory function in the host genome requires experimental evidence that they interact with transcription factors.
The Transcription Factor Database (TRANSFAC; http://transfac.gbf.de/TRANSFAC/) provided a total of 846 experimentally characterized human cis-regulatory sites from 288 genes. TE sequences were then found to overlap the positions of these cis-regulatory sites.
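The overlap test described above can be sketched as an interval comparison. This is only an illustration under stated assumptions: the coordinates are made-up toy values, intervals are treated as 0-based and half-open, and the function names are hypothetical; it is not the pipeline used in the cited study.

```python
from typing import List, Tuple

# (chromosome, start, end); 0-based, half-open. This convention is an assumption.
Interval = Tuple[str, int, int]

def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def sites_in_tes(sites: List[Interval], tes: List[Interval]) -> List[Interval]:
    """Return the regulatory sites that overlap any TE annotation."""
    return [s for s in sites if any(overlaps(s, t) for t in tes)]

# Hypothetical toy annotations, not real genomic coordinates.
te_annotations = [("chr1", 100, 400), ("chr2", 50, 80)]
regulatory_sites = [
    ("chr1", 390, 410),  # overlaps the chr1 TE
    ("chr1", 400, 420),  # starts exactly where the TE ends: no overlap
    ("chr2", 10, 40),    # upstream of the chr2 TE
    ("chr2", 60, 70),    # fully inside the chr2 TE
]

te_derived_sites = sites_in_tes(regulatory_sites, te_annotations)
print(len(te_derived_sites), "of", len(regulatory_sites), "sites overlap a TE")
```

For genome-scale annotations a sorted sweep or an interval tree would be preferable to this quadratic scan.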
2. Transposable elements drive oncogene expression
Genetic mutation, gene amplification and chromosomal rearrangement are three classic genetic mechanisms that drive cancer progression and identity, but they provide an incomplete explanation for oncogene activation. Oncogenes can be activated through many different mechanisms.
Although epigenetically silenced in somatic tissues, TEs can become active in cancer due to DNA hypomethylation, which can expose regulatory sequences and lead to functional consequences. In tumors, the CpG islands of genes are hypermethylated, while genome-wide hypomethylation leads to increased TE expression.
Indeed, some TEs are epigenetically reactivated as cryptic promoters to drive oncogene expression in cancer, a process known as onco-exaptation. However, it was not yet known whether this mechanism of TE-driven oncogene expression is widespread in cancer.
In addition, we detected at least one onco-exaptation event in 49.7% of all tumors, with prevalence ranging from 12 to 87% across cancer types, indicating that onco-exaptation could be a promiscuous mechanism for oncogene activation. Various TEs can activate an in-frame isoform of the same gene. The pan-cancer analysis shows that this is a general mechanism: some events are common across many cancer types, while others are specific to a particular cancer type.
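The prevalence figures above are fractions of tumors carrying at least one event, computed overall and per cancer type. A minimal sketch of that calculation, assuming a hypothetical input of `(cancer_type, number_of_events)` pairs; the toy values and type labels are invented and do not reproduce the reported numbers:

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

def prevalence_by_type(tumors: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Fraction of tumors with at least one onco-exaptation event, per cancer type."""
    totals: Dict[str, int] = defaultdict(int)
    positives: Dict[str, int] = defaultdict(int)
    for cancer_type, n_events in tumors:
        totals[cancer_type] += 1
        if n_events >= 1:
            positives[cancer_type] += 1
    return {t: positives[t] / totals[t] for t in totals}

# Hypothetical toy cohort: (cancer type label, events detected in that tumor).
toy_tumors = [
    ("typeA", 2), ("typeA", 1), ("typeA", 0), ("typeA", 3),
    ("typeB", 0), ("typeB", 0), ("typeB", 1), ("typeB", 0),
]

per_type = prevalence_by_type(toy_tumors)
overall = sum(1 for _, n in toy_tumors if n >= 1) / len(toy_tumors)
print(per_type, overall)
```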
Eight of these candidates were predicted to form in-frame transcripts that conserve protein sequence, suggesting preservation of oncogene function. Onco-exaptation candidates include isoforms of genes such as SALL4 and LIN28B that have recently emerged as potent cancer drivers. Half of the top candidates were associated with worse survival in at least one cancer type. Among the top 10 candidates, these fusion transcripts still retain the function of the oncogene.
The HERVH-SLCO1B3 transcript, a previously characterized onco-exaptation event, is abundant across various cancer types, highly expressed and associated with worse prognosis.
[1] Babaian, A. et al. Onco-exaptation of an endogenous retroviral LTR drives IRF5 expression in Hodgkin lymphoma. Oncogene 35, 2542–2546 (2016).
HERVH-SLCO1B3 is indeed also highly expressed in glioma, where it is associated with poor prognosis.
For validation, we sought to confirm transcription initiation from a few exapted TEs. We queried the FANTOM5 promoter database and discovered that five out of the ten most prevalent onco-exaptation candidates show a promoter signature. That is, among these TE-driven oncogene events, half of the TEs display a promoter signal.
We validated a few FANTOM5 results by mapping transcription start sites (TSS) with cap analysis of gene expression (CAGE)-seq in the H727 lung carcinoid cell line. Indeed, the SYT1 and ARID3A oncogenes are transcribed from alternative promoters located in TEs. CAGE-seq in a lung carcinoid cell line thus identified TEs acting as oncogene promoters.
The AluJb TE is located 20 kilobases (kb) upstream of the canonical promoter of LIN28B and drives the majority of LIN28B's expression in a substantial number of tumors. AluJb-LIN28B is the most common event found in the lung cancer RNA-seq data.
To verify the existence of the AluJb-LIN28B isoform in lung cancer cell lines, we profiled TSSs in the H1299 and H838 cell lines by using paired-end CAGE-seq. We confirmed a CAGE peak, composed of mate reads that align to LIN28B, which spans ~40 base pairs (bp) in the AluJb element in both cell lines. Next, we profiled DNA methylation levels and chromatin accessibility by using WGBS-seq and ATAC-seq, respectively. The AluJb TE is completely methylated in somatic tissues profiled by Roadmap (http://www.roadmapepigenomics.org/). In H1299, the region surrounding the AluJb promoter (AluJb-P) is unmethylated, whereas in H838, it is ~50% methylated. In both cell lines, the region displayed accessibility, indicating an open chromatin state. Together, these findings suggest that an AluJb TE is epigenetically reactivated as an alternative promoter to drive LIN28B expression in lung cancer cell lines. In summary, CAGE-seq mapping of transcription start sites in the two lung cancer cell lines shows that the LIN28B start site coincides with a ~40 bp peak in AluJb. The epigenetic state of the element (chromatin accessibility and methylation) indicates that the AluJb locus is closed in normal tissues but open in lung cancer cells.
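Statements such as "unmethylated" or "~50% methylated" summarize per-CpG read counts over a region. A minimal sketch of one common summary, the pooled fraction of methylated reads, assuming a hypothetical input of `(position, methylated_reads, total_reads)` tuples; the numbers are toy values, and the actual WGBS processing used in the study is not described in these notes:

```python
from typing import List, Optional, Tuple

# (CpG position, reads supporting methylation, total reads covering the CpG)
CpG = Tuple[int, int, int]

def region_methylation(cpgs: List[CpG], start: int, end: int) -> Optional[float]:
    """Pooled methylation level of CpGs with start <= position < end.

    Returns None when no read covers the region.
    """
    in_region = [(m, n) for pos, m, n in cpgs if start <= pos < end]
    methylated = sum(m for m, _ in in_region)
    covered = sum(n for _, n in in_region)
    return methylated / covered if covered else None

# Hypothetical toy counts around a promoter spanning positions 100-200.
toy_cpgs = [(105, 5, 10), (120, 0, 10), (130, 10, 10), (500, 9, 10)]

level = region_methylation(toy_cpgs, 100, 200)
print(level)
```

Pooling reads weights well-covered CpGs more heavily; averaging the per-CpG fractions is an alternative that weights every CpG equally.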
Next, we dissected the genetic determinants behind the AluJb-LIN28B onco-exaptation event. Two further TEs, a truncated AluJb and MLT1B (the boxes on either side of the purple one in the figure above), lie upstream of AluJb-P. Luciferase assays using various combinations of TEs before a luciferase reporter showed that vectors without AluJb-P displayed minimal activity (Fig. 2c). Furthermore, the luciferase activity did not diminish in the solo AluJb-P vector relative to other vectors. These results illustrate that AluJb-P contains all the necessary sequences for strong promoter activity, and the upstream TEs have minimal cis-regulatory effect on AluJb-P transcription. Because TEs can bind transcription factors, the authors asked whether these neighboring TEs influence promoter function; the dual-luciferase assay showed that only AluJb-P has promoter activity.
AluJb is a primate-specific subfamily in the short interspersed nuclear element (SINE) class of TEs. SINE elements are known to recruit RNA polymerase (RNAP) III to generate short transcripts that can potentially be retrotransposed. However, most messenger RNAs are typically transcribed by RNAP II. We hypothesized that AluJb-P accumulated mutations through evolution that generated novel transcription factor binding sites that recruit RNAP II. We then identified potential novel transcription factor motifs that were generated by mutations specific to AluJb-P with FIMO. In short, AluJb is a SINE and would normally be transcribed by RNAP III, so it presumably acquired mutations that allow it to act as a promoter transcribed by RNAP II. Four novel transcription factor binding sites were indeed identified (see the figure above).
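The idea of motifs "generated by mutations" can be illustrated by scanning two sequences with a position weight matrix and keeping hits present only in the mutated copy. The sketch below is a simplified log-odds scan, not FIMO itself (FIMO additionally reports p-values); the motif frequencies, the two short sequences and the score threshold are all invented for illustration.

```python
import math
from typing import Dict, List, Tuple

def log_odds_pwm(freqs: List[Dict[str, float]], background: float = 0.25) -> List[Dict[str, float]]:
    """Convert per-position base frequencies into log2 odds against a uniform background."""
    return [{base: math.log2(col[base] / background) for base in "ACGT"} for col in freqs]

def scan(seq: str, pwm: List[Dict[str, float]], threshold: float) -> List[Tuple[int, float]]:
    """Return (start, score) for every window scoring at or above the threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        score = sum(pwm[j][seq[i + j]] for j in range(width))
        if score >= threshold:
            hits.append((i, score))
    return hits

def column(preferred: str) -> Dict[str, float]:
    """Toy frequency column: 0.85 for the preferred base, 0.05 for each other base."""
    return {b: (0.85 if b == preferred else 0.05) for b in "ACGT"}

# Hypothetical 4-bp motif preferring TGAC; not a real transcription factor model.
toy_pwm = log_odds_pwm([column(b) for b in "TGAC"])

subfamily_consensus = "AATGGCAA"  # hypothetical ancestral sequence
element_copy = "AATGACAA"         # hypothetical copy with a single G>A change

consensus_hits = {i for i, _ in scan(subfamily_consensus, toy_pwm, threshold=5.0)}
copy_hits = {i for i, _ in scan(element_copy, toy_pwm, threshold=5.0)}
gained_sites = sorted(copy_hits - consensus_hits)
print(gained_sites)
```

A real analysis would scan both strands, use curated motif models, and control for multiple testing.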
To interrogate the functional importance of these motifs, we cloned AluJb-P sequences mutagenized for each motif into a luciferase reporter and assessed the change in promoter activity. The predicted transcription factor binding motifs all proved to be functional: the luciferase signal dropped after each was mutated.
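The change in promoter activity from such reporter assays is usually expressed relative to the unmutated construct. A minimal sketch, assuming a dual-luciferase design in which firefly signal is normalized to a co-transfected Renilla control; the readings are invented and the function name is hypothetical:

```python
def relative_activity(firefly: float, renilla: float,
                      ref_firefly: float, ref_renilla: float) -> float:
    """Firefly/Renilla ratio of a construct divided by the same ratio for a reference."""
    return (firefly / renilla) / (ref_firefly / ref_renilla)

# Hypothetical raw luminescence readings (arbitrary units).
wild_type = {"firefly": 8000.0, "renilla": 1000.0}
motif_mutant = {"firefly": 2000.0, "renilla": 1000.0}

mutant_vs_wt = relative_activity(
    motif_mutant["firefly"], motif_mutant["renilla"],
    wild_type["firefly"], wild_type["renilla"],
)
print(mutant_vs_wt)
```

A value below 1 would indicate that the mutated motif contributes to promoter activity; in practice replicate wells and a statistical test are needed before drawing that conclusion.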
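Western blots verified the expected size difference between the onco-exapted AluJb-LIN28B isoform present in H1299 and H838 cells compared to the canonical LIN28B protein present in K562 and HepG2. Two questions remain to be answered: 1. Is the extra portion of this fusion protein, relative to the normal protein, contributed by the transposable element? 2. Is the fusion protein functional?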
To confirm that the larger protein originated from AluJb-P, we performed CRISPR-Cas9-mediated deletion of AluJb-P in H1299 and H838. In addition, we deleted a 1-kb sequence of the canonical LIN28B promoter (LIN28BP). The deletion of AluJb-P abolished the larger LIN28B protein, while the deletion of LIN28BP did not. The AluJb-LIN28B protein is identical to canonical LIN28B, aside from the additional N-terminal amino acids. In other words, CRISPR deletions were made either at AluJb-P or in a 1-kb stretch of the canonical LIN28B promoter, and only the AluJb-P deletion eliminated the larger protein, which confirms that the extra portion of AluJb-LIN28B relative to canonical LIN28B comes from AluJb-P.
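LIN28B represses let-7 miRNAs, ultimately contributing to oncogenesis through the upregulation of oncogenes such as MYC and RAS. This is the mechanism by which LIN28B promotes carcinogenesis. After the knockout, let-7 expression was upregulated in different cell lines, and the cancer cells grew more slowly and migrated less.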
In both the H1299 and H838 cell lines the knockout suppressed tumor growth, but it had no effect in K562 cells, which demonstrates specificity.
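K562: human chronic myeloid leukemia cells
THP-1: human monocytic leukemia cells
H1299: human non-small cell lung cancer cells
H838: human non-small cell lung cancer cells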
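Finally, how can this TE-driven oncogene expression be counteracted? The authors used the CRISPR-SunTag system to modulate the methylation state of the AluJb locus (recruiting DNMT3A in H1299 to increase methylation, and recruiting TET1CD in K562 to remove methylation) and found that this significantly changes expression of the fusion protein.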
[1] Jordan I K, Rogozin I B, Glazko G V, et al. Origin of a substantial fraction of human regulatory sequences from transposable elements[J]. Trends in Genetics, 2003, 19(2): 68-72.
[2] Forrest, A. R. R. et al. A promoter-level mammalian expression atlas. Nature 507, 462–470 (2014).
[3] https://zhuanlan.zhihu.com/p/50743961
[4] Jang H S, Shah N M, Du A Y, et al. Transposable elements drive widespread expression of oncogenes in human cancers[J]. Nature Genetics, 2019, 51(4): 611-617.