Converting GenBank (.gbk) files to FASTA format

This is done with the Biopython package for Python.


Steps:
1. Install the Biopython package: run pip install biopython in the Anaconda Prompt, or !pip install biopython in Spyder;
2. Convert the .gbk file to nucleotide sequences

from Bio import SeqIO

gbk_filename = "c00079_GUT_GEN...region001.gbk"  # input .gbk file name
fna_filename = "c00079_GUT_GEN...region001.fna"  # output nucleotide FASTA file name

# "with" closes both files even if parsing fails partway through.
with open(gbk_filename, "r") as input_handle, open(fna_filename, "w") as output_handle:
    for seq_record in SeqIO.parse(input_handle, "genbank"):
        print("Dealing with GenBank record %s" % seq_record.id)
        # Header line: record ID and description; the sequence follows on one line.
        output_handle.write(">%s %s\n%s\n" % (
            seq_record.id,
            seq_record.description,
            seq_record.seq))
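
The loop above writes each FASTA record by hand. Biopython also provides SeqIO.convert, which does the same GenBank-to-FASTA conversion in one call and returns the number of records written. The sketch below is only an illustration: the record is a made-up one built in memory so that the example runs without an input file, and the helper name genbank_to_fasta is hypothetical. In practice you would pass the .gbk and .fna file names instead of the StringIO handles.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical demo record (not real data), built in memory for illustration.
record = SeqRecord(
    Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"),
    id="DEMO0001.1",
    name="DEMO0001",
    description="hypothetical demo record",
    annotations={"molecule_type": "DNA"},  # required when writing GenBank
)

# Render the record as GenBank text, standing in for the contents of a .gbk file.
gbk_text = record.format("genbank")


def genbank_to_fasta(gbk_handle, fasta_handle):
    """Convert every GenBank record to nucleotide FASTA; return the record count."""
    return SeqIO.convert(gbk_handle, "genbank", fasta_handle, "fasta")


fasta_out = StringIO()
n_records = genbank_to_fasta(StringIO(gbk_text), fasta_out)
fasta_text = fasta_out.getvalue()

print("Converted %d record(s)" % n_records)
print(fasta_text)
```

Unlike the hand-written loop, SeqIO.convert wraps long sequences across several lines instead of writing each sequence on a single line.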

3. Convert the .gbk file to protein sequences

from Bio import SeqIO

gbk_filename = "NC_005213.gbk"            # input .gbk file name
faa_filename = "NC_005213_converted.faa"  # output protein FASTA file name

with open(gbk_filename, "r") as input_handle, open(faa_filename, "w") as output_handle:
    for seq_record in SeqIO.parse(input_handle, "genbank"):
        print("Dealing with GenBank record %s" % seq_record.id)
        for seq_feature in seq_record.features:
            if seq_feature.type != "CDS":
                continue
            # Some CDS features (e.g. pseudogenes) carry no /translation; skip them.
            if "translation" not in seq_feature.qualifiers:
                continue
            assert len(seq_feature.qualifiers["translation"]) == 1
            output_handle.write(">%s from %s\n%s\n" % (
                seq_feature.qualifiers["locus_tag"][0],
                seq_record.name,
                seq_feature.qualifiers["translation"][0]))
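
After the conversion, a quick sanity check is to count the records in the output file and compare that number with the number of CDS features you expected. The sketch below uses only the standard library. The function name count_fasta_records and the two example records are hypothetical placeholders, not real data.

```python
def count_fasta_records(lines):
    """Count FASTA records by counting header lines, which start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


# Hypothetical output in the same layout as the protein FASTA written above.
example_lines = [
    ">DEMO_0001 from DEMO\n",
    "MKTAYIAKQR\n",
    ">DEMO_0002 from DEMO\n",
    "MSLLTEVETP\n",
]

n_proteins = count_fasta_records(example_lines)
print("Found %d protein record(s)" % n_proteins)
```

To check a real file, pass an open file handle instead of the list, for example count_fasta_records(open(faa_filename)).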
