Corresponding author: Jan Korbel
GROUP LEADER AND SENIOR SCIENTIST
Molecular Medicine Partnership Unit
Types of SVs and their formation mechanisms
Functional impact of SVs
a. A gene that is normally regulated specifically by its regulatory elements.
b. Aberrant transcripts: SVs can remove part of a coding region or fuse different coding regions after a duplication, resulting in aberrant transcripts.
c. Expression level (dosage): deletions or duplications can lead to altered doses of otherwise functionally intact elements.
d. Effects on regulatory elements: the result is altered regulatory input (left) or altered gene copy number (right). Structural variants can also affect the expression of genes outside of the variants (that is, a positional effect).
1. Structural variants in human disease
1.1 Mendelian sporadic disorders involving structural variants
Trisomies, monosomies, translocations and CNVs can cause disorders such as Smith–Magenis syndrome (SMS), Williams–Beuren syndrome (WBS) and Potocki–Lupski syndrome (PLS).
Regions near large low-copy-number repeats (also known as segmental duplications) are more prone to non-allelic homologous recombination (NAHR), which leads to recurrent DNA rearrangements. The presence of these characteristic genomic sequence features has prompted the use of such large low-copy-number repeats to identify novel genomic loci that are susceptible to structural variant formation.
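The idea of using repeat pairs to flag loci that may be prone to NAHR can be sketched roughly as follows. This is a minimal illustration, not the method of any particular study: the `RepeatPair` record, the function name and all four thresholds are hypothetical choices made for the example, and the coordinates in the usage note are invented.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RepeatPair:
    """A pair of segmental-duplication copies on one chromosome (hypothetical record)."""
    chrom: str
    start_a: int            # copy A, 0-based start
    end_a: int              # copy A, exclusive end
    start_b: int            # copy B, 0-based start
    end_b: int              # copy B, exclusive end
    identity: float         # fraction of identical bases between the copies
    same_orientation: bool  # True if the copies are directly oriented


# Illustrative thresholds only (assumptions, not values taken from these notes).
MIN_REPEAT_LEN = 10_000
MIN_IDENTITY = 0.95
MIN_SPACING = 50_000
MAX_SPACING = 10_000_000


def candidate_nahr_region(pair: RepeatPair) -> Optional[Tuple[str, int, int]]:
    """Return the interval between the two copies if the pair passes all filters."""
    first, second = sorted([(pair.start_a, pair.end_a), (pair.start_b, pair.end_b)])
    shortest_copy = min(first[1] - first[0], second[1] - second[0])
    spacing = second[0] - first[1]
    # Directly oriented copies are the configuration in which NAHR yields
    # deletions or duplications of the intervening sequence.
    if (
        pair.same_orientation
        and shortest_copy >= MIN_REPEAT_LEN
        and pair.identity >= MIN_IDENTITY
        and MIN_SPACING <= spacing <= MAX_SPACING
    ):
        return (pair.chrom, first[1], second[0])
    return None
```

With an invented pair such as `RepeatPair("chr17", 1_000_000, 1_200_000, 4_900_000, 5_100_000, 0.98, True)`, the function returns the 3.7 Mb interval between the two copies; lowering the identity below the threshold makes it return `None`.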
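Recurrent structural variants can lead to variable phenotypes, with distinct clinical manifestations. For example, partial deletions of the 22q11 region can cause DiGeorge syndrome (DGS) or velocardiofacial (VCF) syndrome. Sometimes an SV affecting a single gene leads to several phenotypes, and sometimes SVs in several genes are needed before the phenotype changes.
Mb-sized SVs affect several gene loci (either protein-coding or non-protein-coding functional heritable units) and are more likely to act through gene dosage effects.
In the WBS, SMS and PLS models, positional effects are more likely to be at work.
According to DECIPHER annotations, 20% of disorders are caused by duplications, whereas 80% are caused by deletions, and duplications in the DECIPHER database are associated with milder phenotypes. Moreover, deletions can unmask recessive alleles that are present in the remaining copy of the respective region, and can expose inactive imprinted genes.
Some whole-chromosome gains (trisomies) are not lethal in humans (trisomies of chromosomes 21, 18, 13 and X), but with the exception of monosomy X, all monosomies are embryonic lethal.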
1.2 Complex disease and structural variants
GWASs and the construction of population reference sets for structural variants have helped to uncover many associations between structural variants and such disorders, including attention deficit hyperactivity disorder, Crohn's disease, rheumatoid arthritis, and type 1 and type 2 diabetes.
The next step is to discover more rare de novo structural variant formation events.
Interestingly, structural variants associated with complex phenotypes frequently intersect with Mendelian disease loci.
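Predicting the phenotypic consequences of structural variants remains difficult, partly because our knowledge of haploinsufficiency (half the normal dosage is deleterious) and dosage sensitivity (both increased and decreased gene dosage are deleterious) is incomplete. So far no comparable data are available for humans, but indirect evidence suggests that humans also tolerate changes in gene dosage to some extent.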
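1.3 Challenges in interpreting structural variant disease phenotypes
Structural variants often show variable expressivity and penetrance, which makes them harder to ascertain. Environmental factors and/or other genetic variants may contribute to this variable expressivity and penetrance. In addition to further allelic variation within the region affected by the structural variant, epistatic interactions involving different loci can also contribute to phenotypic changes.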
2. Molecular consequences of structural variants
2.1 The impact of gene copy number on mRNA levels
CNVs alter mRNA levels.
Globally, there is indeed an appreciable correlation between mRNA levels and gene copy number.
Notably, however, for individual genes mRNA levels often deviated from the expected levels: that is, they were not halved when one gene copy was deleted, nor increased by a 3:2 ratio in a trisomic state. Further to this, not all genes with altered copy number displayed altered expression, and a small proportion even showed expression changes that were inverse to the copy-number alteration. Changes in mRNA level are therefore not a simple matter of adding or subtracting copies.
Regulatory mechanisms can buffer copy-number changes through epistatic interactions or autoregulatory feedback at the level of individual genes, and even through compensation mechanisms that act on larger regions.
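The gap between the naive dosage expectation (1:2 for a heterozygous deletion, 3:2 for a trisomy) and what is actually measured can be made concrete with a small sketch. The category names and the tolerance of 0.15 are assumptions chosen for illustration, not definitions from the notes.

```python
def expected_ratio(copies: int, normal_copies: int = 2) -> float:
    """Expression ratio expected if mRNA scaled linearly with gene copy number."""
    return copies / normal_copies


def classify_response(copies: int, observed_ratio: float,
                      tolerance: float = 0.15, normal_copies: int = 2) -> str:
    """Compare an observed expression ratio (variant / normal) with the dosage expectation."""
    if copies == normal_copies:
        raise ValueError("copy number is unchanged; nothing to classify")
    expected = expected_ratio(copies, normal_copies)
    if abs(observed_ratio - expected) <= tolerance:
        return "proportional"   # follows copy number, e.g. 0.5 or 1.5
    if abs(observed_ratio - 1.0) <= tolerance:
        return "unchanged"      # expression is not altered despite the CNV
    is_gain = copies > normal_copies
    went_up = observed_ratio > 1.0
    if went_up != is_gain:
        return "inverse"        # expression moves against the copy-number change
    # Same direction as the copy-number change, but a different magnitude.
    if is_gain:
        return "buffered" if observed_ratio < expected else "amplified"
    return "buffered" if observed_ratio > expected else "amplified"
```

For example, a gene with one copy deleted and an observed ratio of 0.75 would be called "buffered" under these assumptions, whereas a trisomic gene with an observed ratio of 0.7 would be called "inverse".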
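Although measurable effects are usually largest when an SV intersects a gene, many expression quantitative trait loci (eQTLs) can be attributed to SVs in intergenic regions, which suggests that these SVs may act by altering gene regulatory regions.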
2.2 Structural variants and regulatory elements
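SVs that affect only discrete parts of a gene's cis-regulatory architecture may have tissue-specific or developmental-stage-specific effects, which can result in tissue-specific loss of gene function.
Besides causing loss of gene expression, structural variants can also induce overexpression or ectopic expression.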
2.3 Structural variants and protein expression levels
mRNA abundance only partly predicts protein abundance, particularly in mammals, in which genome-wide no more than 40% of protein abundance can be explained by mRNA abundance.
Changes in mRNA levels caused by copy-number alterations do not necessarily result in corresponding changes in protein levels, as protein levels are additionally modulated by post-transcriptional regulation, translational control, protein folding and stability, and higher-order regulatory interactions between genes and proteins.
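To put the 40% figure in perspective, the short calculation below converts it into a correlation coefficient. It assumes that "explained" refers to the coefficient of determination (R²) of a simple linear fit between mRNA and protein abundance; that reading is an assumption of this sketch, and the function names are made up for the example.

```python
import math


def implied_correlation(explained_fraction: float) -> float:
    """Absolute Pearson correlation implied by an R^2 value in a simple linear model."""
    if not 0.0 <= explained_fraction <= 1.0:
        raise ValueError("explained_fraction must lie between 0 and 1")
    return math.sqrt(explained_fraction)


def unexplained_fraction(explained_fraction: float) -> float:
    """Share of the variance left to factors other than mRNA abundance."""
    return 1.0 - explained_fraction


r = implied_correlation(0.40)        # roughly 0.63
residual = unexplained_fraction(0.40)  # roughly 0.60
```

Under this reading, at least 60% of the variation in protein abundance is left to the other layers listed above (post-transcriptional regulation, translational control, protein folding and stability, and higher-order interactions).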
3. Linking structural variants and phenotypes
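3.1 Ascertaining structural variant breakpoints and allelic status
Structural variants can arise from complex changes involving multiple breakpoints, including the catastrophic multi-breakpoint chromosomal change known as "chromothripsis".
Beyond breakpoint precision, it is also important to know the allelic status of a structural variant. An estimated 11% or more of loci harbour common, heritable multi-allelic variants; that is, the respective genomic locus has undergone multiple independent or simultaneous DNA rearrangement events.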
3.2 Mapping minimal critical regions
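Once the genomic region affected by a structural variant has been determined, the remaining challenge is to identify the functionally relevant genes within or around the variant.
However, the resolution of minimal critical region mapping depends on the availability of patients with overlapping phenotypes and atypical genomic rearrangements.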
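In its simplest form, minimal critical region mapping reduces to intersecting the rearranged intervals of patients who share a phenotype and then listing the genes that fall in the shared segment. The sketch below assumes half-open coordinates on a single chromosome; the function names, coordinates and gene labels are invented for illustration, and real analyses also have to deal with breakpoint uncertainty and incomplete penetrance.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end), half-open, same chromosome


def minimal_critical_region(deletions: List[Interval]) -> Optional[Interval]:
    """Return the segment shared by all patient deletions, or None if there is none."""
    if not deletions:
        return None
    start = max(s for s, _ in deletions)
    end = min(e for _, e in deletions)
    return (start, end) if start < end else None


def genes_in_region(region: Interval, genes: Dict[str, Interval]) -> List[str]:
    """Return the sorted names of genes that overlap the region."""
    r_start, r_end = region
    return sorted(
        name for name, (g_start, g_end) in genes.items()
        if g_start < r_end and g_end > r_start
    )


# Three typical deletions share a broad segment ...
typical = [(100, 900), (300, 1200), (250, 700)]
broad = minimal_critical_region(typical)
# ... and one atypical, smaller rearrangement narrows it considerably.
narrow = minimal_critical_region(typical + [(650, 1500)])

genes = {"GENE_A": (320, 400), "GENE_B": (660, 690), "GENE_C": (800, 850)}
candidates = genes_in_region(narrow, genes)
```

Here the three typical deletions leave a 400-unit shared segment containing two of the hypothetical genes, while adding the atypical patient shrinks it to a 50-unit segment containing only `GENE_B`, which is why access to atypical rearrangements determines the resolution.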
3.3 Animal models of structural variants
Mice, dogs and cattle can all serve as model organisms; mouse models are the most commonly used.
3.4 Computational approaches
Recently, advances have been made in two areas: first, in predicting the properties of disease-causing genes within structural variants; and second, in identifying networks and cellular processes that are disrupted in disease.
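Initial efforts to identify the key genes affected by structural variants focused on haploinsufficient genes (haploinsufficiency: after one allele is mutated, the other allele is still expressed normally, but the resulting 50% of the normal protein level is not enough to maintain normal cellular function). Such genes are typically longer than haplosufficient genes and show a higher degree of evolutionary conservation.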
4. Perspectives and future challenges
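Limitations of current research:
1. Present studies are mostly directed to 'unique' regions of the genome, but are 'blind' towards the phenotypic contribution of structural variants in complex, repeat-rich, highly duplicated areas of the genome, which are difficult, or even impossible, to ascertain using current genomics technologies.
2. Exome-capture-based sequencing is increasingly being used, primarily to assess SNVs and indels.
Although such studies have lower cost and turnaround time than whole-genome sequencing, they are limited for structural variant mapping, particularly for small variants, variants in non-coding genomic regions, and balanced variants such as translocations.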
Requirements for further progress:
Correct and comparable genotype–phenotype correlation and interpretation is highly dependent on sample quality (which includes standardization of the clinical and phenotypic information of these cohorts), sequencing data production and computational analysis. Large sequencing projects, such as Deciphering Developmental Disorders and UK10K, are specifically devoting efforts to collect such data. In addition, the International Standards for Cytogenomic Arrays (ISCA) Consortium has launched a whole-genome array database, which provides clinicians and researchers free access to a searchable catalogue of ranked disease-associated structural variants and affected genes, thus offering a fast and user-friendly interface to query a locus of interest.
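New research directions:
1. Regulatory landscapes and the three-dimensional organization of the genome.
2. The ENCODE consortium now provides an unprecedented resource for scrutinizing the phenotypic effects of intergenic structural variants in a tissue-specific setting.
3. Technical advances in culturing disease-specific induced pluripotent stem cells (iPSCs) in vitro, together with progress in in vivo chromosome engineering, are providing suitable model systems for recreating pathogenic variants (including duplications, inversions and translocations).
4. The development of site-directed nucleases opens up new avenues for generating targeted structural variants.
5. Large-scale resources in model organisms, such as the International Knockout Mouse Consortium, provide a growing catalogue of loss-of-function alleles that facilitates the analysis of individual candidate genes.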