SAM file study notes, part 1

1. FLAG description

 Each bit in the FLAG field is defined as:
0x0001	p	the read is paired in sequencing
0x0002	P	the read is mapped in a proper pair
0x0004	u	the query sequence itself is unmapped
0x0008	U	the mate is unmapped
0x0010	r	strand of the query (1 for reverse)
0x0020	R	strand of the mate
0x0040	1	the read is the first read in a pair
0x0080	2	the read is the second read in a pair
0x0100	s	the alignment is not primary
0x0200	f	the read fails platform/vendor quality checks
0x0400	d	the read is either a PCR or an optical duplicate
0x0800	S	the alignment is supplementary

where the second column gives the string representation of the FLAG field. 

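2. Interpretation: the 0x prefix marks a hexadecimal number. Each value in the table is a single bit with its own specific meaning, and the FLAG of a record is the sum of all the bits that are set.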
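To make the "sum of bits" idea concrete, here is a minimal Python sketch. The constant names and the helper combine_flags are hypothetical, chosen only for illustration; the bit values come from the table in section 1.

```python
# Named constants for a few FLAG bits, copied from the table in section 1.
# The names themselves are illustrative, not part of the SAM format.
PAIRED = 0x0001
PROPER_PAIR = 0x0002
REVERSE = 0x0010
SECOND_IN_PAIR = 0x0080

# Hexadecimal and decimal are two spellings of the same integer.
assert SECOND_IN_PAIR == 128 and REVERSE == 16


def combine_flags(*bits):
    """OR the given bits together into a single FLAG value."""
    flag = 0
    for bit in bits:
        flag |= bit
    return flag


flag = combine_flags(PAIRED, PROPER_PAIR, REVERSE, SECOND_IN_PAIR)
print(flag, hex(flag))  # prints: 147 0x93
```

Because every bit is a distinct power of two, OR-ing the bits gives the same result as adding them.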

3. Example:
read1:

@chrUn_KN707963v1_decoy_19393_19870_2:0:0_0:0:0_0/1
CCATTTGATTCCATTCCTTTGGATTCCATTCCATTGTATTGGATTGCATTGGATTCCATTCCATTCTATT
+
2222222222222222222222222222222222222222222222222222222222222222222222


read2:


@chrUn_KN707963v1_decoy_19393_19870_2:0:0_0:0:0_0/2
TCAAAGGGAATAGAATCGAATGAAATAGAATCTAATGGAATGGAATGGAATGGAATGGAATGGAATGGAA
+
2222222222222222222222222222222222222222222222222222222222222222222222

Corresponding SAM output (excerpt):

@SQ	SN:HLA-DRB1*15:03:01:01	LN:11567
@SQ	SN:HLA-DRB1*15:03:01:02	LN:11569
@SQ	SN:HLA-DRB1*16:02:01	LN:11005
@PG	ID:bwa	PN:bwa	VN:0.7.13-r1126	CL:bwa sampe ../hs38DH.fa hs38DHPE1L100F1.sai hs38DHPE1L100F2.sai hs38DHPE1L100F1.fq hs38DHPE1L100F2.fq
chrUn_KN707963v1_decoy_19393_19870_2:0:0_0:0:0_0	99	chrUn_KN707963v1_decoy	19393	60	70M	=	19801	478	CCATTTGATTCCATTCCTTTGGATTCCATTCCATTGTATTGGATTGCATTGGATTCCATTCCATTCTATT	2222222222222222222222222222222222222222222222222222222222222222222222	XT:A:U	NM:i:2	SM:i:37	AM:i:37	X0:i:1	X1:i:0	XM:i:2	XO:i:0	XG:i:0	MD:Z:40C4C24
chrUn_KN707963v1_decoy_19393_19870_2:0:0_0:0:0_0	147	chrUn_KN707963v1_decoy	19801	60	70M	=	19393	-478	TTCCATTCCATTCCATTCCATTCCATTCCATTCCATTAGATTCTATTTCATTCGATTCTATTCCCTTTGA	2222222222222222222222222222222222222222222222222222222222222222222222	XT:A:U	NM:i:0	SM:i:37	AM:i:37	X0:i:1	X1:i:0	XM:i:0	XO:i:0	XG:i:0	MD:Z:70

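Here 147 is the FLAG of read2, and 147 = 128 + 16 + 2 + 1:

128 (0x0080): the read is the second read in a pair

16 (0x0010): strand of the query (1 for reverse), i.e. the query sequence aligned to the reverse strand

2 (0x0002): the read is mapped in a proper pair

1 (0x0001): the read is paired in sequencing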
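The same decomposition can be done mechanically by testing each bit of the FLAG. The following is a small Python sketch; FLAG_BITS and decode_flag are hypothetical names for this illustration, and the descriptions are taken from the table in section 1. It decodes both 147 (read2) and 99 (read1) from the SAM records above.

```python
# (bit, description) pairs copied from the FLAG table in section 1.
FLAG_BITS = [
    (0x0001, "the read is paired in sequencing"),
    (0x0002, "the read is mapped in a proper pair"),
    (0x0004, "the query sequence itself is unmapped"),
    (0x0008, "the mate is unmapped"),
    (0x0010, "strand of the query (1 for reverse)"),
    (0x0020, "strand of the mate"),
    (0x0040, "the read is the first read in a pair"),
    (0x0080, "the read is the second read in a pair"),
    (0x0100, "the alignment is not primary"),
    (0x0200, "the read fails platform/vendor quality checks"),
    (0x0400, "the read is either a PCR or an optical duplicate"),
    (0x0800, "the alignment is supplementary"),
]


def decode_flag(flag):
    """Return the (bit, description) pairs whose bit is set in flag."""
    return [(bit, desc) for bit, desc in FLAG_BITS if flag & bit]


for flag in (99, 147):
    print(flag)
    for bit, desc in decode_flag(flag):
        print(f"  {bit:>4} (0x{bit:04x}): {desc}")
```

For 99 this yields the bits 1, 2, 32 and 64: read1 is paired, in a proper pair, is the first read, and its mate (read2) is on the reverse strand, which agrees with the 16 bit set in 147.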

Original sequence as produced (read2 in the FASTQ):
TCAAAGGGAATAGAATCGAATGAAATAGAATCTAATGGAATGGAATGGAATGGAATGGAATGGAATGGAA
Sequence after alignment (SEQ field of the SAM record):
TTCCATTCCATTCCATTCCATTCCATTCCATTCCATTAGATTCTATTTCATTCGATTCTATTCCCTTTGA


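As can be seen, the two sequences are reverse complements of each other: reading the aligned sequence backwards from its end gives AGTTTCCC..., which is the base-by-base complement of the TCAAAGGG... at the start of the original read.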
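This relationship can be checked with a few lines of Python. The helper reverse_complement is a hypothetical name written for this note; the two sequences are copied from the FASTQ and SAM records above.

```python
# Translation table mapping each base to its complement (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


# read2 as it appears in the FASTQ file.
fastq_read2 = (
    "TCAAAGGGAATAGAATCGAATGAAATAGAATCTAATGGAATGGAATGGAATGGAATGGAATGGAATGGAA"
)
# SEQ field of the SAM record with FLAG 147.
sam_seq = (
    "TTCCATTCCATTCCATTCCATTCCATTCCATTCCATTAGATTCTATTTCATTCGATTCTATTCCCTTTGA"
)

print(reverse_complement(fastq_read2) == sam_seq)  # prints: True
```

So when the 16 bit is set, the sequence stored in the SAM record is the reverse complement of the sequence in the FASTQ file.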

