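Multiple sequence alignment figures annotated with protein secondary structure turn up all the time in the literature, and they can look very polished. In this post I share how to make the best-looking combined alignment and secondary-structure figure I have come across so far. I hope it helps with your research.
The website is at:
https://espript.ibcp.fr/ESPript/cgi-bin/ESPript.cgi
The example data are the 19 homologous protein sequences used in an earlier post: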
>AST51816.1 Venus [Cloning vector pSTB205]
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKLICTTGKLPVPWPTLVTTLGYGLQCFARYPDHMK
QHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYITADKQKN
GIKANFKIRHNIEDGGVQLADHYQQNTPIGDGPVLLPDNHYLSYQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKE
LLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKVCC
IHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMIKS
VLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTPTN
TWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMGGF
ELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL
>NP_191351.1 squamosa promoter binding protein-like 15 [Arabidopsis thaliana]
MELLMCSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMI
KSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP
TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMG
GFELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL
>KAG7634825.1 SBP domain superfamily [Arabidopsis suecica]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMI
KSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP
TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPADFQISNGTTMG
GFELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL
>CAA0387110.1 unnamed protein product [Arabidopsis thaliana]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMI
KSVLGDPTAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTP
TNTWRPSSGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPVDFQISNGTTMG
GFELYLHQQVLKQYMEPENTRAYDSSPQHFNWSL
>CAD5326126.1 unnamed protein product [Arabidopsis thaliana]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSRSKNRVNTVRKSSTTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSHYSRIAPSLYGNPNAAMIKSVLGDP
TAWSTARSVMQRPGPWQINPVRETHPHMNVLSHGSSSFTTCPEMINNNSTDSSCALSLLSNSYPIHQQQLQTPTNTWRPS
SGFDSMISFSDKVTMAQPPPISTHQPPISTHQQYLSQTWEVIAGEKSNSHYMSPVSQISEPVDFQISNGTTMGGFELYLH
QQVLKQYMEPENTRAYDSSPQHFNWSL
>KAG7561265.1 SBP domain superfamily [Arabidopsis thaliana x Arabidopsis arenosa]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSRYSRIAPSLYGNPNAAMI
KSVLGDPMAWSTAKSVMRRSGPWQINPERESHQLLNVLSHGSSSFTTCPEIINNNSTDSSCALSLLSNSNPIQQQQLQTP
TNLWRPSSGFDSLISFSDRVTMAQPPPISTHHQYLSQTWEVMAGEKSNSHYISPVSQISEPADFQISNGTTMGGFELSLH
QQVLRQYMEPENTRAYDSSPQHFNWSL
>XP_002878178.1 squamosa promoter-binding-like protein 15 [Arabidopsis lyrata subsp. lyrata]
MELLMGSGQAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQVEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTALFTSRYTRIAPSLYGNPNAAMI
KSVLGDPTAWSTARSVMRRSGPWQINPERESHQIMNVLSHGSSSFTTCPEITNNNSTDSSCALSLLSNSNPIQQQQLQTP
TNLWRPSSGFDSMISFSDRVTMAQPPPISTHHQYLSQTWDVMAGGKSNSHYMSPVSQISEPAEFQISNGTTMGGFELSLH
QQVLRQYMEPENTRAYDSSPQHFNWSL
>KAG7566101.1 SBP domain [Arabidopsis suecica]
MELLMGSGHAESGGSSSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQLEGCRMDLSNVKAYYSRHKV
CCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTSSLFTSRYSRIAPSLYGNPNAAMI
KSVLGDPMAWSTAKSVMRRSGPWQINPERESHQLLNVLSHGSSSFTTCPEIINNNSTDSSCALSLLSNSNPIQQQQLQTP
TNLWRPSSGFDSLISFSDRVTMAQPPPISTHHQYLSQTWEVMAGEKSNSHYISPVSQISEPAGFQISNGTTMGGFELSLH
QQVLRQYMEPENTRAYDSSPQHFNWSL
>CAE6076605.1 unnamed protein product [Arabidopsis arenosa]
MRRGRGKGKRQNATAREDRGSGEEEKIPAFRRRGRPQKPVKDEIEEEEVELVKKTEEEEDKDDDTNGSVTSKEDVTENGR
KRKKPVESKESNITEEENGVGSKSSTEDSMKSSSSIGFRQNGSRRKNKPRRAAEAVVECNGAESGGSSSTESSSLSGGLR
FGQKIYFEDGSGSGSKNRVNTGRKSTMTARCQVEGCRMDLSNVKAYYSRHKVCCIHSKSSKVIVSGLHQRFCQQCSRFHQ
LSEFDLEKRSCRRRLACHNERRRKPQSTTSLFTSRYSRIAPSLYGNPNAAMIKSVLGDPMAWSTAKSVMRRSGPWQINPE
RESHQLLNVLSHGSSSFTTCPEIINNNSTDSSCALSLLSNSNPIQQQQLQTPTNLWRPSSGFDSLISFSDRVTMAQPPPI
STHHQYLSQTWEVMAGEKSNSHYISPVSQISEPADFQISNGSTMGGFELSLHQQVLRQYMEPENTRAYDSSPQHFNWSL
>XP_006291402.1 squamosa promoter-binding-like protein 15 [Capsella rubella]
MELLMGSGQAESGGSSSTESSLLSGGLRFGQKIYFEDGSGSGSKNRVSTGHKSSMTTVARCQVEGCKMDLSNAKAYYSRH
KVCCIHSKSSKVIVSGLHQRFCQQCSRFHHLSEFDLEKRSCRRRLACHNERRRKPQPATLFTSHYTRIAPSLYGNANAAM
IKSVLGDPTAWSTSRSVMRSSGPWQINPVKESNQLMNVYSQESSSFTITCPEMMNNNSTDSGCALSLLSNSNPIQQQQQQ
PQTQTNIWRSSSGFDSMILDRVTMAQPPPISGHHQYLNQTLAFMAGEKSNSHYMSPVLGPSQISEPDEFQISNGTTMDGF
ELSLHQQVLRQYMEPENTRAYDSSPHYFNWSL
>CAH2063751.1 unnamed protein product [Thlaspi arvense]
MELLMGSGQNRTESYGSSSTESSSLSGGLRFGQKIYFEDGSGSGGGSNKNRVNTGRKSRTARCQVEGCRMDLSNVKTYYS
RHKVCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQATTSLLTSRYSRIAPSLYGNAN
TAMIRSVLGDPTAWSTARSVMRRSAPWQINPERESHQLMNVFSHDSSSFTTTCPEMMNSNGTDSSCALSLLSNSNTNQQQ
QLLQTSTNIWRPSSGFDSANADRATMAQPPPVSNQHQYLNQTWEFMAGEKSNSHYLSPVLGLSQISEPVDFQISNGTTMG
GFELSIHQQVLRHYMEPENTRAYDSSAQHFNWSL
>XP_010516431.1 PREDICTED: squamosa promoter-binding-like protein 15 [Camelina sativa]
MELLIGGSGQTESGGASSTKSSSLSGGLRFGQKIYFEDGSGSGSKNRVGTGHKSSTTTTTARCQVEGCKMDLSNAKAYYS
RHKVCCIHSKSSKVIVSGLRQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTLYTSQYTRIAPSLYGDANA
AMMKSVLGDPTVWSTARSVMRRSGPWQISPVKESHHQLMNVFSQESSSFTITCPEMMNNNSTDSSCALSLLSNSNSNSNP
IQQQQQQLQTQTHIWRPSLGFDSMTVDRVTMAQPPPISSHHQYLNQTLEFMAGEKSSSHYMSPVLGPSQISEPDEFQISN
GTTMDGFELSLHQQVLRQYMEPENTRAYDSSPHHFNWSL
>AKC05620.1 squamosa promoter-binding-like protein 15 [Cardamine hirsuta]
MELLMGSGQSESGASSSNESSSLSGGLRFGQKIYFEDGSGSGSKNRVSSTGRKSSTTTARCQVEGCRMDLSNAKTYYSRH
KVCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPATTLFTSRFTRTAPSHYGNANAA
MIKSVLGDPTAWTAERSVMRRSAPWQSNPSHQVMIDFSHGSSSLTTTCPEMMNNTSTDSSCALSLLSNSNQTQQLQQQLQ
TPANIWRASSGFDSMIADRVTMAQPPPISTHHQYLNQSWEFMPGEKNDSHYMSPMSQISEPADLHMRNRTTMGGFEVSLH
QQVMRQYMAPENTRAYDSSPQHFNWSL
>XP_010504729.1 PREDICTED: squamosa promoter-binding-like protein 15 [Camelina sativa]
MELLMGGSGQTESGGASSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVGAGHKSSTTARCQVEGCKMDLSNAKAYYSRHK
VCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTLYTRIAASLYGNANAAMIKSVL
GDPTVWSTARSVMRRSGPWQINPVKESHHQHMNVFSQESSSFTITCPEMMNNNSTDSSCALSLLSNSNSNPIQQQQQQLQ
TQTNIWRPSSGFDYMTVDRVTLAQPPPIPSHHQYLNQTLEFMTGEKNSSHYMSPALGPSQISAPDEFQISNGTTMDGFEL
SLHQQVLRQYMAPENTRAYDSSPHHFNWSL
>CAA7060637.1 unnamed protein product [Microthlaspi erraticum]
MELLMDSSQTESGGSSSIESSSLTGGLRFGQKIYFEDGSGSGAKSSKNRVNTARKSSTSTARCQVEGCRMDLSNAKTYYS
RHKVCCIHSKSSNVIVSGLHQRFHLLSEFDLEKRSCRRRLACHNERRRKPHATTNLLTSRYSRIAPSLYENANTAIFRSV
LGDTTAWSAARPVMRRSGPWQINPERESNLNVFSHGSSSFTTCPAMMNNNSTDSSCALSLLSNSNTNTNQQQQQPLQTST
DTWRPSSGFDSMIADRVTMAQPPPVSIHNQYLNQSWDFMEGEKSNSHHMSPVLGLSQISEPADFQLSNGMGGGFELSLHQ
QVLKQYMEPENTRAYDSSPQHFNWSL
>KAG2324838.1 hypothetical protein Bca52824_007566 [Brassica carinata]
MELLMGSGQDHPQSAGSSSTLSGGLRFGQKIYFEDGSGAGLSRNRVNNTGRKSMTARCQVEGCRMDLSNAKTYYSRHKVC
CVHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQTTTTLLTSHYSSIAPSLYGNAIRSVLG
DPTLWSTARGSSAPWQINPERESHHQLMNIISFGSSSFTNSTDSSCALSLLSNSNRNQQEQQPLQTPTNAWRPSLDFDSI
VADRVTMAQPPPVSIQNQYLNQTWEFMSGEKSNAHCISPVLGLSQISEPVDFQTSNGATMSGVELSLHQQVLRQYLEPEN
TRAYDSSHQHFNWSL
>CAH8384605.1 unnamed protein product [Eruca vesicaria subsp. sativa]
MELEMGSGQKKPESAGSSSTLSGGLRFGQKIYFEDGSGAGLSKNRVSSTGRKSMTARCQVEGCRTDLSNAKTYYSRHKVC
CVHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQATTTLLTSRYSSLYGNAIRSVLGDPTT
WSTARGSAPWKINQESDRHQLMNVISFGSSSFTTCPEMMNNNSTDSSCALSLLSNSNPNQQEQQPLQTSNTIWRPSLDFD
STVADRVTMAQPPPVSMQNQYLNQTWEFMSGEKSNAQCISPVLGQSQISEPVDFQIGTTMGGGFELSLHQQVLRQYMEPE
NTRAYDTSPQYFNWSL
>KAF8114775.1 hypothetical protein N665_0034s0114 [Sinapis alba]
MELLMGSGQNQPESAGSSSSTLSGGLRFGQKIYFEDGSGAGLSKNRVNTGRKSTTARCQVEGCRMDLSSAKTYYSRHKVC
CIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQATTTFLTSHYSSIAPSLYGNAIRGVLG
DSTTWSTARGSAPLQINPERESHRLMNVFSFGSSSFTNNSTDSSCALSLLSNSNPNQQEQQPLQTPTNTWRPSLDFDSIV
ADRVTMAQPPPVSVQNQYLNQTWEFMSGEKSNGQHYISPVLGLSQISEPVDFQISNGATMSGVELSLHQQVLRQYLEPEN
TRAYDSSPQHFNWSL
>XP_010427684.1 PREDICTED: squamosa promoter-binding-like protein 15 [Camelina sativa]
MELLMGGTESGGASSTESSSLSGGLRFGQKIYFEDGSGSGSKNRVVTGHKSSTTTTTARCQVEGCKMDLSNAKAYYSRHK
VCCIHSKSSKVIVSGLHQRFCQQCSRFHQLSEFDLEKRSCRRRLACHNERRRKPQPTTLFTSHYTRIAPSLYGNANAAMI
KSVLGDPTVWSTARSVMRRSGPWQINPVKESHHQLMNVFSQESSSFTITCPEMMNNNNSTDSSCALSLLSNSNSNPIQQQ
QQQLQTQTNIWRPSLGFDSMTVDRVTLAQPPPILSHHQYMSPVLGPSQISAPDEFQISNVTTMDGFELSLHQQVLRQYME
PQNTRAYDSSPHHFNWSL
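Before aligning, it is worth confirming that the FASTA file really contains the number of sequences you expect. Below is a minimal sketch in plain Python (no Biopython needed). The function name `read_fasta` and the two-record demo text are placeholders of my own, not part of any tool used in this post; for the real data you would pass in the contents of your own FASTA file.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are joined without spaces
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Tiny made-up demo input; replace with open("your_file.fasta").read()
example = """>seqA demo
MELLMG
SGQAES
>seqB demo
MELLMC
"""

records = read_fasta(example)
print("number of sequences:", len(records))
for header, seq in records:
    # The accession is the first whitespace-separated token of the header.
    print(header.split()[0], len(seq))
```

Run on the data above, the record count should come out as 19 if the file was copied completely.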
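Before using the site, the sequences have to be aligned; MEGA7 is enough for this. Then submit the first sequence for protein homology modeling at:
https://swissmodel.expasy.org/
From the modeling results, pick the best model and download its PDB file.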
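A homology model often covers only part of the submitted sequence, so it can help to look at which residues the downloaded PDB file actually contains. The sketch below reads the C-alpha `ATOM` records using the fixed column positions of the PDB format. The function name `ca_sequence` and the demo lines (with dummy coordinates) are my own illustration, not output from SWISS-MODEL.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def ca_sequence(pdb_text, chain=None):
    """Return [(residue_number, one_letter_code), ...] from CA atoms.

    If chain is None, residues from all chains are returned in file order.
    """
    residues = []
    seen = set()
    for line in pdb_text.splitlines():
        # Columns 13-16 hold the atom name; keep only C-alpha atoms.
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        chain_id = line[21]                    # column 22
        if chain is not None and chain_id != chain:
            continue
        key = (chain_id, line[22:27])          # residue number + insertion code
        if key in seen:
            continue                           # skip alternate locations
        seen.add(key)
        code = THREE_TO_ONE.get(line[17:20], "X")  # columns 18-20
        residues.append((int(line[22:26]), code))
    return residues


# Made-up demo records with dummy coordinates; use your downloaded file instead.
demo_pdb = "\n".join([
    "ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 20.00           N",
    "ATOM      2  CA  MET A   1      12.560  13.100   2.300  1.00 20.00           C",
    "ATOM      3  CA  GLU A   2      14.100  15.900   4.200  1.00 20.00           C",
    "ATOM      4  CA  LEU A   3      17.300  14.800   6.000  1.00 20.00           C",
])

residues = ca_sequence(demo_pdb, chain="A")
sequence = "".join(code for _, code in residues)
first, last = residues[0][0], residues[-1][0]
print(f"model covers residues {first}-{last}: {sequence}")
```

Comparing the printed sequence with the first sequence in the alignment shows which stretch of it the secondary-structure annotation can apply to.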
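Now open the ESPript / ENDscript site, upload the corresponding files (the alignment and the PDB file), keep the default options of the beginner-level mode for rendering the figure, and click the SUBMIT button at the top of the page.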
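If the upload is rejected or the figure looks odd, one thing to rule out is a malformed alignment file. In any multiple alignment every row must have the same length once gaps are counted, and a column made only of gaps usually means something went wrong during export. Here is a small sanity check under those assumptions; `check_alignment` and the demo rows are hypothetical names and data of mine, and the input is a plain dict of name to aligned sequence.

```python
def check_alignment(rows, gap="-"):
    """Return a list of human-readable problems found in an alignment.

    rows maps a sequence name to its aligned sequence (gaps included).
    An empty list means no problem was detected by these two checks.
    """
    problems = []
    lengths = {name: len(seq) for name, seq in rows.items()}
    if len(set(lengths.values())) > 1:
        problems.append(f"rows differ in length: {lengths}")
        return problems  # column checks make no sense on ragged rows
    # zip(*...) walks the alignment column by column.
    for index, column in enumerate(zip(*rows.values()), start=1):
        if all(char == gap for char in column):
            problems.append(f"column {index} contains only gaps")
    return problems


# Made-up demo alignments.
good = {"a": "MEL-LM", "b": "MELQLM"}
ragged = {"a": "MEL-LM", "b": "MELQL"}
gappy = {"a": "ME-LM", "b": "ME-LM"}

for label, rows in [("good", good), ("ragged", ragged), ("gappy", gappy)]:
    print(label, check_alignment(rows) or "ok")
```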
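The PDB file used here was built from the first protein sequence, which differs considerably from the other sequences. For a better-looking figure, build the model from a sequence that is highly homologous to the rest.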
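One rough way to pick such a sequence is to compute, from the finished alignment, each sequence's average percent identity to all the others and model the one with the highest value. This is only a sketch of that idea, not something the ESPript site does for you. Identity is counted here over columns where neither sequence has a gap, which is one common convention among several, and the names `percent_identity` and `mean_identity` plus the three demo rows are my own.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity of two aligned rows, ignoring columns with a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip columns where either row has a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_identity(alignment):
    """Map each name to its mean percent identity against all other rows."""
    result = {}
    for name, seq in alignment.items():
        others = [percent_identity(seq, other)
                  for other_name, other in alignment.items()
                  if other_name != name]
        result[name] = sum(others) / len(others) if others else 0.0
    return result


# Made-up demo alignment; use your own aligned sequences instead.
alignment = {
    "s1": "MELLMC-SG",
    "s2": "MELLMGQSG",
    "s3": "ME--MGQSG",
}

scores = mean_identity(alignment)
best = max(scores, key=scores.get)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.1f}%")
print("candidate for modeling:", best)
```

Whichever sequence you pick should then be moved to the top of the alignment, since the model is meant to correspond to the first sequence.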
生信漫谈: small tips, big insight!
Everyone is welcome to join the discussion and learn together!