Annotating CDR Regions with ANARCI

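Brief

Colleagues often hand me antibody protein sequences and ask me to mark the framework regions and the hypervariable regions (CDRs).
ANARCI numbers the amino acids of an antibody protein sequence and aligns them to a standard numbering scheme.
Once every residue has a position number, the region boundaries can be read directly from the numbering, so ANARCI is a good fit for this task.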

Installation

Open-source software on GitHub:
https://github.com/oxpig/ANARCI

A web version is also available:
https://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/sabpred/anarci/

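# Install the dependencies (HMMER is distributed through the bioconda channel)
conda install biopython
conda install -c bioconda hmmer

# Run from the root of the downloaded ANARCI repository
python3 setup.py install
ANARCI -h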
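
If `ANARCI -h` fails, it can help to confirm that both dependencies are visible from the current environment. The following is a minimal sketch and not part of ANARCI: `check_dependencies` is a hypothetical helper name, and the sketch assumes that HMMER is found through the `hmmscan` executable on PATH.

```python
import importlib.util
import shutil


def check_dependencies():
    """Report whether Biopython and HMMER's hmmscan can be found."""
    return {
        # Biopython is imported under the package name "Bio"
        "biopython": importlib.util.find_spec("Bio") is not None,
        # shutil.which returns None when the executable is not on PATH
        "hmmscan": shutil.which("hmmscan") is not None,
    }


for dependency, found in check_dependencies().items():
    print(dependency, "found" if found else "MISSING")
```

A "MISSING" entry usually means the conda environment used for the installation is not the one currently active.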

Usage

head ./cut.seq.fa
ANARCI -i ./cut.seq.fa --scheme imgt --csv -o 20230823 # writes 20230823_H.csv (heavy chains)
# Split the framework (FW) and CDR regions according to the numbering
python3 ./user_script/print_H.py  ./20230823_H.csv > ./result.cut.txt
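
Splitting by numbering works because the IMGT scheme assigns each region a fixed range of position numbers. As a rough illustration, the lookup below maps a position label to its region. The helper name `imgt_region` is hypothetical, and the sketch assumes that insertion positions are labelled with a letter suffix such as `111A`.

```python
import re

# IMGT region boundaries (inclusive position numbers) for a variable domain
IMGT_REGIONS = [
    ("FW1", 1, 26),
    ("CDR1", 27, 38),
    ("FW2", 39, 55),
    ("CDR2", 56, 65),
    ("FW3", 66, 104),
    ("CDR3", 105, 117),
    ("FW4", 118, 128),
]


def imgt_region(position):
    """Return the region name for an IMGT position label such as '27' or '111A'."""
    # Insertion labels carry a letter suffix; keep only the leading digits
    match = re.match(r"\d+", str(position))
    if match is None:
        raise ValueError(f"{position!r} is not an IMGT position label")
    number = int(match.group())
    for name, start, end in IMGT_REGIONS:
        if start <= number <= end:
            return name
    raise ValueError(f"position {position!r} is outside the IMGT range 1-128")


print(imgt_region("26"))    # FW1
print(imgt_region("111A"))  # CDR3
print(imgt_region(118))     # FW4
```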


cat   ./user_script/print_H.py
#!/public/home/djs/miniconda3/bin/python

# DATE:20230302
# AUTHOR:JiangshanDai

import sys


# Path to the <prefix>_H.csv file written by ANARCI --csv
file = sys.argv[1]
name = []
FW1 = []
CDR1 = []
FW2 = []
CDR2 = []
FW3 = []
CDR3 = []
FW4 = []

with open(file, "r") as f:
    next(f)  # skip the header row
    for line in f:
        if not line.strip():
            continue  # ignore blank lines
        # Strip the line ending so it does not leak into FW4
        fields = line.rstrip("\r\n").split(",")
        name.append(fields[0])
        # Columns 0-12 are metadata. The slices below are tied to the column
        # layout of this particular CSV, including its insertion columns.
        FW1.append("".join(fields[13:39]).replace("-", ""))
        CDR1.append("".join(fields[39:51]).replace("-", ""))
        FW2.append("".join(fields[51:68]).replace("-", ""))
        CDR2.append("".join(fields[68:81]).replace("-", ""))
        FW3.append("".join(fields[81:120]).replace("-", ""))
        CDR3.append("".join(fields[120:-11]).replace("-", ""))
        FW4.append("".join(fields[-11:]).replace("-", ""))

# Print one line per sequence: name, then FW1 through FW4
for record in zip(name, FW1, CDR1, FW2, CDR2, FW3, CDR3, FW4):
    print(*record)
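
The fixed column slices in print_H.py match one specific CSV: the number of insertion columns ANARCI writes can differ between batches, which shifts every index after them. A possible alternative, sketched below, reads the region of each column from the header row instead. This is only a sketch under stated assumptions: the position columns are assumed to be labelled with the IMGT number plus an optional insertion letter (for example `111A`), the first column is assumed to hold the sequence ID, and `region_of` and `split_regions` are hypothetical names. The demo header is a shortened toy example, not real ANARCI output.

```python
import csv
import io
import re

# IMGT region boundaries (inclusive position numbers)
IMGT_BOUNDS = [
    ("FW1", 1, 26), ("CDR1", 27, 38), ("FW2", 39, 55), ("CDR2", 56, 65),
    ("FW3", 66, 104), ("CDR3", 105, 117), ("FW4", 118, 128),
]
# A position column is a number optionally followed by insertion letters
POSITION = re.compile(r"^(\d+)[A-Z]*$")


def region_of(label):
    """Return the region for a header label, or None for metadata columns."""
    match = POSITION.match(label)
    if match is None:
        return None
    number = int(match.group(1))
    for name, start, end in IMGT_BOUNDS:
        if start <= number <= end:
            return name
    return None


def split_regions(handle):
    """Yield (sequence_id, {region: residues}) for each row of the CSV."""
    reader = csv.reader(handle)
    header = next(reader)
    column_regions = [region_of(label.strip()) for label in header]
    for row in reader:
        if not row:
            continue  # skip blank lines
        regions = {name: "" for name, _, _ in IMGT_BOUNDS}
        for region, residue in zip(column_regions, row):
            # "-" marks a gap at that position
            if region is not None and residue != "-":
                regions[region] += residue
        yield row[0], regions


# Toy input with only a handful of position columns
demo = io.StringIO(
    "Id,chain_type,26,27,38,39,111,111A,118\n"
    "seq1,H,S,G,-,M,A,R,W\n"
)
for seq_id, regions in split_regions(demo):
    print(seq_id, *regions.values())
```

For a real file, pass an open handle instead of the toy input, for example `open("20230823_H.csv", newline="")`.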
