Rosalind Exercise 21

# Problem
# A DNA string is a reverse palindrome if it is equal to its reverse complement. For instance, GCATGC is a reverse palindrome because its reverse complement is GCATGC. See Figure 2.

# Given: A DNA string of length at most 1 kbp in FASTA format.
# Return: The position and length of every reverse palindrome in the string having length between 4 and 12. You may return these pairs in any order.

# Sample Dataset
# >Rosalind_24
# TCAATGCATGCGGGTCTATATGCAT
# Sample Output
# 4 6
# 5 4
# 6 6
# 7 4
# 17 4
# 18 4
# 20 6
# 21 4

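# This problem gives a DNA sequence of at most 1000 bp and asks for every reverse palindrome in it, reporting each one's position in the original sequence and its length. A reverse palindrome is a sequence equal to its own reverse complement; GCATGC is one, because its reverse complement is also GCATGC. Only reverse palindromes of length 4 to 12 count, positions are 1-based as in the sample output, and the pairs may be printed in any order.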

from Bio import SeqIO

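def reverse_complement(s):
    # Complement each base (A<->T, C<->G), then reverse the string.
    return s.translate(str.maketrans("ATCG", "TAGC"))[::-1]

def find_reverse_palindromes(seq):
    # Collect (1-based position, length) for every reverse palindrome of length 4 to 12.
    res = []
    for i in range(len(seq)):
        for j in range(4, 13):
            # Longer windows starting at i would run past the end of the sequence.
            if i+j > len(seq):
                break
            subseq = seq[i:i+j]
            if subseq == reverse_complement(subseq):
                res.append((i+1, j))
    return res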

# Read the FASTA file
record = SeqIO.read("input6.fasta", "fasta")
seq = str(record.seq)

# Find all reverse palindromes
res = find_reverse_palindromes(seq)

# Print the results
for pos, length in res:
    print(pos, length)
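
The script above needs Biopython and a local `input6.fasta` file. As a rough sketch for checking the logic without either, the version below runs on the sample dataset directly. It also relies on one observation: a reverse palindrome must have even length, because the middle base of an odd-length string would have to be its own complement, so only lengths 4, 6, ..., 12 are tested. The names `parse_single_fasta` and `find_reverse_palindromes_even` are hypothetical helpers introduced for this illustration and are not part of the original solution.

```python
def reverse_complement(s):
    # Complement each base, then reverse the string.
    return s.translate(str.maketrans("ATCG", "TAGC"))[::-1]


def parse_single_fasta(text):
    # Hypothetical helper: drop the ">" header line and join the sequence lines.
    # Assumes the text holds exactly one FASTA record.
    lines = [line.strip() for line in text.strip().splitlines()]
    return "".join(line for line in lines if line and not line.startswith(">"))


def find_reverse_palindromes_even(seq, min_len=4, max_len=12):
    # Returns 1-based (position, length) pairs, like find_reverse_palindromes,
    # but steps through even lengths only. Assumes min_len is even.
    res = []
    for i in range(len(seq)):
        for j in range(min_len, max_len + 1, 2):
            if i + j > len(seq):
                break
            subseq = seq[i:i + j]
            if subseq == reverse_complement(subseq):
                res.append((i + 1, j))
    return res


sample = """>Rosalind_24
TCAATGCATGCGGGTCTATATGCAT"""

for pos, length in find_reverse_palindromes_even(parse_single_fasta(sample)):
    print(pos, length)
```

On the sample dataset this prints the same eight pairs listed under Sample Output. Skipping odd lengths roughly halves the number of substrings compared, which matters little at 1 kbp but keeps the inner loop focused on candidates that can actually match.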
 
