Long SMS format

Why 70 characters?

Message size

Transmission of short messages between the SMSC and the handset is done using the Mobile Application Part (MAP) of the SS7 protocol. Messages are sent with the MAP MO- and MT-ForwardSM operations, whose payload length is limited by the constraints of the signaling protocol to precisely 140 octets (140 octets = 140 * 8 bits = 1120 bits). Short messages can be encoded using a variety of alphabets: the default GSM 7-bit alphabet, the 8-bit data alphabet, and the 16-bit UCS-2 alphabet.[35] Depending on which alphabet the subscriber has configured in the handset, this leads to maximum individual short message sizes of 160 7-bit characters, 140 8-bit characters, or 70 16-bit characters. GSM 7-bit alphabet support is mandatory for GSM handsets and network elements,[35] but characters in languages such as Arabic, Chinese, Korean, Japanese or Cyrillic alphabet languages (e.g. Russian, Serbian, Bulgarian, etc.) must be encoded using the 16-bit UCS-2 character encoding (see Unicode). Routing data and other metadata are additional to the payload size.
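The three limits follow directly from dividing the 1120-bit payload by the character width. A minimal sketch of that arithmetic (the function name `max_chars` is only illustrative):

```python
PAYLOAD_OCTETS = 140  # payload limit quoted in the paragraph above


def max_chars(bits_per_char: int, payload_octets: int = PAYLOAD_OCTETS) -> int:
    """Number of whole characters of the given width that fit in the payload."""
    return (payload_octets * 8) // bits_per_char


for name, bits in (("GSM 7-bit", 7), ("8-bit data", 8), ("UCS-2", 16)):
    print(f"{name}: {max_chars(bits)} characters")
```

This is where the 70 in the heading comes from: a Chinese message has to use UCS-2, so it gets 1120 / 16 = 70 characters.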

Larger content (concatenated SMS, multipart or segmented SMS, or "long SMS") can be sent using multiple messages, in which case each message will start with a user data header (UDH) containing segmentation information. Since the UDH is part of the payload, the number of available characters per segment is lower: 153 for 7-bit encoding, 134 for 8-bit encoding and 67 for 16-bit encoding. The receiving handset is then responsible for reassembling the message and presenting it to the user as one long message. While the standard theoretically permits up to 255 segments,[36] 6 to 8 segment messages are the practical maximum, and long messages are often billed as equivalent to multiple SMS messages. Some providers have offered length-oriented pricing schemes for messages; however, the phenomenon is disappearing.
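The per-segment figures can be reproduced by subtracting a 6-octet UDH from the 140-octet payload. The sketch below assumes that 6-octet header and counts "characters" as fixed-width code units (it ignores GSM 7-bit extension characters, which take two septets); the function names are illustrative only.

```python
import math

SINGLE_LIMIT = {7: 160, 8: 140, 16: 70}  # characters in one unsegmented message


def chars_per_segment(bits_per_char: int, udh_octets: int = 6,
                      payload_octets: int = 140) -> int:
    """Characters left in one segment after the UDH has taken its share."""
    return ((payload_octets - udh_octets) * 8) // bits_per_char


def segments_needed(n_chars: int, bits_per_char: int) -> int:
    """Number of messages required to carry n_chars characters."""
    if n_chars <= SINGLE_LIMIT[bits_per_char]:
        return 1  # fits in a single message, no UDH needed
    return math.ceil(n_chars / chars_per_segment(bits_per_char))


print(chars_per_segment(7), chars_per_segment(8), chars_per_segment(16))
print(segments_needed(71, 16))  # one character over the UCS-2 limit
```

Note the jump at the boundary: 70 UCS-2 characters fit in one message, but 71 already need two segments of at most 67 characters each.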

 

 ===========================

 

Sending a concatenated SMS using a User Data Header

One way of sending a concatenated SMS (CSMS) is to split the message into parts of 153 7-bit characters (134 octets) and send each part with a User Data Header (UDH) tacked onto the beginning. A UDH can be used for various purposes, and its contents and size vary accordingly, but a UDH for concatenating SMSes looks like this:

  • Field 1 (1 octet): Length of User Data Header, in this case 05.
  • Field 2 (1 octet): Information Element Identifier, equal to 00 (Concatenated short messages, 8-bit reference number)
  • Field 3 (1 octet): Length of the header, excluding the first two fields; equal to 03
  • Field 4 (1 octet): 00-FF, CSMS reference number, must be same for all the SMS parts in the CSMS
  • Field 5 (1 octet): 00-FF, total number of parts. The value shall remain constant for every short message which makes up the concatenated short message. If the value is zero then the receiving entity shall ignore the whole information element
  • Field 6 (1 octet): 00-FF, this part's number in the sequence. The value shall start at 1 and increment for every short message which makes up the concatenated short message. If the value is zero or greater than the value in Field 5 then the receiving entity shall ignore the whole information element. [ETSI Specification: GSM 03.40 Version 5.3.0: July 1996]
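The six fields above map onto six octets. A minimal sketch of a builder for this header (the function name `concat_udh_8bit` is hypothetical; the range checks mirror the rules stated for Fields 5 and 6):

```python
def concat_udh_8bit(ref: int, total: int, seq: int) -> bytes:
    """Build the 6-octet concatenation UDH with an 8-bit reference number."""
    if not 0 <= ref <= 0xFF:
        raise ValueError("reference number must fit in one octet")
    if not 1 <= total <= 0xFF:
        raise ValueError("total number of parts must be 1..255")
    if not 1 <= seq <= total:
        raise ValueError("part number must be 1..total")
    # UDHL=05, IEI=00, IEDL=03, then reference, total, sequence
    return bytes([0x05, 0x00, 0x03, ref, total, seq])


print(concat_udh_8bit(0xCC, 2, 1).hex())
```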

It is possible to use a 16-bit CSMS reference number in order to reduce the probability that two different concatenated messages are sent with identical reference numbers to a receiver. In this case, the User Data Header shall be:

  • Field 1 (1 octet): Length of User Data Header (UDHL), in this case 06.
  • Field 2 (1 octet): Information Element Identifier, equal to 08 (Concatenated short messages, 16-bit reference number)
  • Field 3 (1 octet): Length of the header, excluding the first two fields; equal to 04
  • Field 4 (2 octets): 0000-FFFF, CSMS reference number, must be same for all the SMS parts in the CSMS
  • Field 5 (1 octet): 00-FF, total number of parts. The value shall remain constant for every short message which makes up the concatenated short message. If the value is zero then the receiving entity shall ignore the whole information element
  • Field 6 (1 octet): 00-FF, this part's number in the sequence. The value shall start at 1 and increment for every short message which makes up the concatenated short message. If the value is zero or greater than the value in Field 5 then the receiving entity shall ignore the whole information element. [ETSI Specification: GSM 03.40 Version 5.3.0: July 1996]
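The 16-bit variant differs only in the identifier, the lengths and the two-octet reference. A sketch under one assumption: the reference number is written most significant octet first (the function name `concat_udh_16bit` is hypothetical).

```python
def concat_udh_16bit(ref: int, total: int, seq: int) -> bytes:
    """Build the 7-octet concatenation UDH with a 16-bit reference number."""
    if not 0 <= ref <= 0xFFFF:
        raise ValueError("reference number must fit in two octets")
    if not 1 <= total <= 0xFF:
        raise ValueError("total number of parts must be 1..255")
    if not 1 <= seq <= total:
        raise ValueError("part number must be 1..total")
    # UDHL=06, IEI=08, IEDL=04, then 2-octet reference (big-endian assumed),
    # total, sequence
    return bytes([0x06, 0x08, 0x04]) + ref.to_bytes(2, "big") + bytes([total, seq])


print(concat_udh_16bit(0x0039, 2, 1).hex())
```

Because this header is 7 octets long, each segment has one octet less for the message than with the 8-bit reference form.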

Example of the UDH for an SMS split into two parts:

05 00 03 CC 02 01 [ message ] 
05 00 03 CC 02 02 [ message ]
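On the receiving side, the header is what lets the parts be put back together. The following is a simplified sketch of that reassembly, not how any particular handset does it: it only understands the 6-octet header shown above, treats the message bodies as plain 8-bit data, and groups parts by reference number alone (a real receiver would also take the sender address into account). The name `reassemble` is illustrative.

```python
def reassemble(segments):
    """Group segments by reference number and join the complete ones in order."""
    groups = {}
    for seg in segments:
        if seg[:3] != b"\x05\x00\x03":
            raise ValueError("not an 8-bit-reference concatenation UDH")
        ref, total, seq = seg[3], seg[4], seg[5]
        groups.setdefault((ref, total), {})[seq] = seg[6:]

    complete = {}
    for (ref, total), parts in groups.items():
        # Only join when every part 1..total has arrived.
        if sorted(parts) == list(range(1, total + 1)):
            complete[ref] = b"".join(parts[n] for n in range(1, total + 1))
    return complete


# Parts may arrive out of order; the sequence octet restores the order.
received = [
    bytes.fromhex("050003CC0202") + b"world",
    bytes.fromhex("050003CC0201") + b"hello ",
]
print(reassemble(received))
```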

Note: if a UDH is present and the data encoding is the default 7-bit alphabet, the user data must be aligned to a 7-bit (septet) boundary after the UDH.[2] This means up to 6 bits of zeros need to be inserted at the start of the [message].

E.g. with a UDH containing a single part,

05 00 03 CC 01 01

the UDH is a total of (number of octets x bit size of octets) 6 x 8 = 48 bits long. Therefore a single bit of padding has to be prepended to the message. The UDH is therefore (bits for UDH / bits per septet) = (48 + 1)/7 = 7 septets in length.
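The same calculation works for any header length. A small sketch (function names are illustrative):

```python
def udh_fill_bits(udh_octets: int) -> int:
    """Zero bits needed after the UDH to reach the next septet boundary."""
    return -(udh_octets * 8) % 7


def udh_septets(udh_octets: int) -> int:
    """Length of the UDH plus fill bits, counted in septets."""
    return (udh_octets * 8 + udh_fill_bits(udh_octets)) // 7


for octets in (6, 7):
    print(octets, udh_fill_bits(octets), udh_septets(octets))
```

For the 6-octet header this gives 1 fill bit and 7 septets, as computed above; a 7-octet header (56 bits) is already septet-aligned and needs no fill bits.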

With a message of "hello world", the [message] is encoded as

D0 65 36 FB 0D BA BF E5 6C 32

whereas without padding, the [message] would be

E8 32 9B FD 06 DD DF 72 36 19

and the UDL is 7 (header septets) + 11 (message septets) = 18 septets.
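Both octet strings can be reproduced with a short packing routine. This is a sketch that relies on one simplification: for the characters in "hello world", the GSM 7-bit default alphabet codes equal the ASCII codes, so `ord()` is used instead of a real alphabet table. The name `pack_septets` is illustrative.

```python
def pack_septets(septets, fill_bits=0):
    """Pack 7-bit values into octets, least significant bit first.

    fill_bits zero bits are placed before the first septet, which is how
    the padding after a UDH is represented.
    """
    acc = 0
    nbits = fill_bits
    for s in septets:
        acc |= (s & 0x7F) << nbits
        nbits += 7
    return acc.to_bytes((nbits + 7) // 8, "little")


# Assumption: these characters share their code with ASCII in the GSM alphabet.
septets = [ord(c) for c in "hello world"]

padded = pack_septets(septets, fill_bits=1)  # after a 6-octet UDH
unpadded = pack_septets(septets)             # no UDH, no padding
udl = 7 + len(septets)                       # header septets + message septets

print(padded.hex(), unpadded.hex(), udl)
```

The padded form is simply the unpadded bit stream shifted up by one bit, which is why every octet differs between the two encodings.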

 

=======================================

 

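Extra-long SMS: when the message content exceeds 70 Chinese characters, it has to be split into several messages when it is submitted to the gateway, but the user's handset receives it as a single message. (This is the SP's point of view; the concept is the same when a handset sends a long SMS.)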

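In the CMPP protocol, the CMPP_SUBMIT message definition has a parameter for this:
TP_udhi: 0 means the message body contains no protocol header; 1 means the body contains a protocol header. (Long SMS, push SMS and similar messages all carry a header at the start of the body, which corresponds to setting the basic parameter octet (TP-MTI/TP-VPF) to 0x51.) When the body is flagged as containing a header, the header must be written according to the specification. There are two header formats for long SMS: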
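6-byte header format: 05 00 03 XX MM NN
byte 1 : 05, the length of the rest of the header.
byte 2 : 00, defined in section 9.2.3.24.1 of the GSM 03.40 specification; it indicates that the identifier of this batch of long-SMS parts is 1 byte long (the XX in the format).
byte 3 : 03, the length of the remaining fields (identifier, total and sequence number).
byte 4 : XX, the unique identifier of this batch of messages (it must be identical in every part of the split message). In practice, once the SME (handset or SP) has merged the parts it starts recording afresh, so whether the identifier is truly unique is not very important.
byte 5 : MM, the number of messages in this batch. If a long SMS consists of 5 parts, the value is 5.
byte 6 : NN, the sequence number of this message within the batch: 1 for the first part, 2 for the second.
Example: 05 00 03 39 02 01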


 

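The header is 6 bytes in total, as follows:
1. Byte 1: header length, always 0x05;
2. Byte 2: header type identifier, always 0x00, meaning long SMS;
3. Byte 3: sub-packet length, always 0x03, the length of the three bytes that follow;
4. Bytes 4 to 6: the packet content:
1) Byte 4: long-message reference number. Each long message an SP sends to a given user should use a different reference number; it can start at 0 and increase by 1 each time, up to a maximum of 255, so that a terminal can tell apart different long messages from the same SP;
2) Byte 5: total number of parts in this long message, from 1 to 255; normally at least 2;
3) Byte 6: position (sequence number) of this part within the long message, from 1 to 255: 1 for the first part, 2 for the second, and the last part equals the value of byte 5.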
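The rule for byte 4 (a different reference number for each long message sent to the same user, counting up from 0) can be sketched as a small per-destination counter. The class name `RefAllocator` is hypothetical, and wrapping back to 0 after 255 is an assumption of this sketch; the text above only says that 255 is the maximum.

```python
from collections import defaultdict


class RefAllocator:
    """Hands out long-message reference numbers, one counter per destination."""

    def __init__(self):
        self._next = defaultdict(int)  # destination number -> next reference

    def allocate(self, dest: str) -> int:
        ref = self._next[dest]
        # Assumption: wrap around to 0 once 255 has been used.
        self._next[dest] = (ref + 1) % 256
        return ref


alloc = RefAllocator()
print([alloc.allocate("13800000000") for _ in range(3)])
print(alloc.allocate("13900000000"))  # a different user has its own counter
```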

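7-byte header format: 06 08 04 XX XX MM NN
byte 1 : 06, the length of the rest of the header.
byte 2 : 08, defined in section 9.2.3.24.1 of the GSM 03.40 specification; it indicates that the identifier of this batch of long-SMS parts is 2 bytes long (the XX XX in the format).
byte 3 : 04, the length of the remaining fields (identifier, total and sequence number).
byte 4-5 : XX XX, the unique identifier of this batch of messages. In practice, once the SME (handset or SP) has merged the parts it starts recording afresh, so whether the identifier is truly unique is not very important.
byte 6 : MM, the number of messages in this batch. If a long SMS consists of 5 parts, the value is 5.
byte 7 : NN, the sequence number of this message within the batch: 1 for the first part, 2 for the second.
Example: 06 08 04 00 39 02 01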
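Both header formats can be read with one routine that looks at byte 2 to decide how long the identifier is. A minimal sketch follows; the function name `parse_concat_udh` is hypothetical, the 2-byte identifier is assumed to be big-endian, and a header whose total or sequence number is invalid is reported as `None`, following the "ignore the whole information element" rule quoted earlier.

```python
def parse_concat_udh(user_data: bytes):
    """Return (ref, total, seq, header_len), or None if there is no usable
    concatenation element at the start of the message body."""
    udhl = user_data[0]
    header = user_data[1:1 + udhl]
    if len(header) != udhl:
        raise ValueError("truncated header")

    i = 0
    while i + 2 <= len(header):
        iei, iedl = header[i], header[i + 1]
        data = header[i + 2:i + 2 + iedl]
        if len(data) != iedl:
            raise ValueError("truncated information element")
        if iei == 0x00 and iedl == 3:        # 1-byte identifier
            ref, total, seq = data[0], data[1], data[2]
        elif iei == 0x08 and iedl == 4:      # 2-byte identifier
            ref = int.from_bytes(data[:2], "big")
            total, seq = data[2], data[3]
        else:                                # some other element: skip it
            i += 2 + iedl
            continue
        if total == 0 or seq == 0 or seq > total:
            return None
        return ref, total, seq, 1 + udhl
    return None


print(parse_concat_udh(bytes.fromhex("05 00 03 39 02 01")))
print(parse_concat_udh(bytes.fromhex("06 08 04 00 39 02 01")))
```

The last value in the result is the total header length, that is, the offset at which the actual message text starts.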

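That essentially completes the setup for sending a long SMS, but one point needs attention: the Src_Id field of the protocol must be identical in every part of one long SMS; otherwise the handset parses the parts as three separate messages, and all three show up as erroneous messages.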

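After two days of struggling, the short version for CMPP long SMS is this: the CMPP fields pk_total and pk_number have to be set, and TP_udhi has to be set to 1. With the 6-byte UDH, split the message content into 134-byte chunks and prepend the 6-byte UDH to each chunk, which gives exactly 140 bytes. (This is a boundary case: some CMPP servers require strictly fewer than 140 bytes and reject exactly 140, so you have to try it yourself.) Also note the last three bytes of the UDH: the first is the reference number, which only needs to avoid clashing with other long messages, and the other two must match pk_total/pk_number. In addition, the seq_num of these messages must be the same as well. Receiving on a Nokia 6630 worked fine: the handset automatically concatenated the parts into one long SMS.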
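The splitting step can be sketched as follows. This is only an illustration, not a CMPP client: the dictionary keys borrow the CMPP_SUBMIT field names (Pk_total, Pk_number, TP_udhi, Msg_Fmt, Msg_Content), but the dictionary layout and the function name `split_for_cmpp` are hypothetical, and the text is assumed to be sent as UCS-2 (Msg_Fmt 8) and to contain only characters that fit in one 16-bit unit.

```python
def split_for_cmpp(text: str, ref: int, max_body: int = 134):
    """Split text into long-SMS parts with a 6-byte UDH in front of each."""
    if max_body % 2:
        raise ValueError("max_body must be even so no 2-byte character is split")
    data = text.encode("utf-16-be")  # UCS-2 for BMP characters (assumption)

    if len(data) <= 140:
        # Fits in one message: no header, TP_udhi stays 0.
        return [{"Pk_total": 1, "Pk_number": 1, "TP_udhi": 0,
                 "Msg_Fmt": 8, "Msg_Content": data}]

    chunks = [data[i:i + max_body] for i in range(0, len(data), max_body)]
    total = len(chunks)
    if total > 255:
        raise ValueError("too many parts for a one-byte total")

    parts = []
    for number, chunk in enumerate(chunks, start=1):
        udh = bytes([0x05, 0x00, 0x03, ref & 0xFF, total, number])
        parts.append({"Pk_total": total, "Pk_number": number, "TP_udhi": 1,
                      "Msg_Fmt": 8, "Msg_Content": udh + chunk})
    return parts


for part in split_for_cmpp("\u4e2d" * 71, ref=0x39):
    print(part["Pk_number"], part["Pk_total"], len(part["Msg_Content"]))
```

For a server that rejects bodies of exactly 140 bytes, passing `max_body=132` keeps every part at 138 bytes or fewer.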
