2022-09-12 ROSALIND Problem 5: Computing GC Content and Returning the Maximum

# First read the file line by line and strip the trailing newline \n from each line
with open("ROSALIND_FASTA.txt") as file:
    FASTAseq = file.readlines()
    FASTAseq = [x.rstrip() for x in FASTAseq]  # strip the trailing newline \n in the list comprehension
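
The line-by-line read above can also be folded into a small parser that groups the sequence lines under their header. This is only a minimal sketch, assuming standard FASTA input where every header line starts with `>`; `parse_fasta` and the sample records are hypothetical names made up for illustration, not part of the original solution.

```python
def parse_fasta(lines):
    """Group FASTA lines into a dict of {record name: joined sequence}."""
    records = {}
    current = None  # name of the record currently being read
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            current = line[1:]      # drop the leading ">"
            records[current] = []
        elif current is not None:
            records[current].append(line)
    # Join the collected lines of each record into one string
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical sample input, standing in for the lines of ROSALIND_FASTA.txt
sample = [">Rosalind_0001", "ACGT", "GGCC", ">Rosalind_0002", "ATAT"]
print(parse_fasta(sample))
```

Keying on the `>` prefix instead of the word "Rosalind" would keep the parser usable for FASTA files whose record names follow a different pattern.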
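# Split FASTAseq into one slice per record
def FASTAseq_new(seq):
    # Lists that collect the intermediate results
    new_fastaseq = []    # one slice per record: [header, sequence line, ...]
    new_fastaseq2 = []   # the joined sequence of each record
    GC = []              # GC content of each record, in percent

    # Index of every header line (substring match on "Rosalind")
    myROSindex = [i for i, x in enumerate(seq) if "Rosalind" in x]

    # Record names: the header lines without the leading ">"
    myROSname = [seq[i][1:] for i in myROSindex]

    for i in range(len(myROSindex)):
        # The last record has no following header, so slice to the end of the list
        if i != len(myROSindex) - 1:
            new_fastaseq.append(seq[myROSindex[i]:myROSindex[i+1]])
        else:
            new_fastaseq.append(seq[myROSindex[i]:])

    # Uncomment to check that the slicing is correct
    # return new_fastaseq

    for j in range(len(new_fastaseq)):
        # Drop the header and join the remaining lines into one sequence
        new_fastaseq2.append("".join(new_fastaseq[j][1:]))

        GC.append((new_fastaseq2[j].count("C") + new_fastaseq2[j].count("G")) / len(new_fastaseq2[j]) * 100)
    # GC = [float('{:.3f}'.format(i)) for i in GC]  one way to round the values; adding it raised an error for me and I did not find out why

    GCindex = GC.index(max(GC))
    # GCdict = dict(zip(myROSname, GC))  a dictionary would also work here; I took the most direct route

    print(myROSname[GCindex])
    print(max(GC))


FASTAseq_new(FASTAseq)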
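
The two ideas left as comments in the function, rounding the percentages and building a name-to-GC dictionary, can be sketched as below. This is one possible variant and not the solution used in this post; `gc_content`, `gc_by_name` and the sample records are hypothetical names introduced only for illustration.

```python
def gc_content(seq):
    """Return the GC content of a DNA string in percent (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq) * 100


# Hypothetical records, standing in for the parsed FASTA file
records = {
    "Rosalind_0001": "ACGTGGCC",
    "Rosalind_0002": "ATAT",
    "Rosalind_0003": "GGGCAT",
}

# Map each record name to its GC content
gc_by_name = {name: gc_content(seq) for name, seq in records.items()}

# Pick the record name whose GC content is largest
best = max(gc_by_name, key=gc_by_name.get)

print(best)
# Format only when printing, so the stored values keep full precision
print(f"{gc_by_name[best]:.6f}")
```

Formatting at print time, rather than rounding the list in place, keeps the comparison in `max` unaffected by rounding.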
        
