Why 70 characters?
Transmission of short messages between the SMSC and the handset is done using the Mobile Application Part (MAP) of the SS7 protocol. Messages are sent with the MAP MO- and MT-ForwardSM operations, whose payload length is limited by the constraints of the signaling protocol to precisely 140 octets (140 octets = 140 * 8 bits = 1120 bits). Short messages can be encoded using a variety of alphabets: the default GSM 7-bit alphabet, the 8-bit data alphabet, and the 16-bit UCS-2 alphabet.[35] Depending on which alphabet the subscriber has configured in the handset, this leads to maximum individual short message sizes of 160 7-bit characters, 140 8-bit characters, or 70 16-bit characters. GSM 7-bit alphabet support is mandatory for GSM handsets and network elements,[35] but characters in languages such as Arabic, Chinese, Korean, Japanese, or languages written in the Cyrillic alphabet (e.g. Russian, Serbian, Bulgarian) must be encoded using the 16-bit UCS-2 character encoding (see Unicode). Routing data and other metadata are additional to the payload size.
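The three limits follow directly from dividing the 1120-bit payload by the character width. A minimal sketch of that arithmetic (the names `PAYLOAD_OCTETS` and `max_chars` are mine, purely for illustration):

```python
PAYLOAD_OCTETS = 140  # payload limit quoted in the text above


def max_chars(bits_per_char: int, payload_octets: int = PAYLOAD_OCTETS) -> int:
    """Whole characters that fit in the payload at the given character width."""
    return (payload_octets * 8) // bits_per_char


for name, width in (("GSM 7-bit", 7), ("8-bit data", 8), ("UCS-2", 16)):
    print(f"{name}: {max_chars(width)} characters")
```

So a message containing Chinese text, which has to use UCS-2, tops out at 70 characters.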
Larger content (concatenated SMS, multipart or segmented SMS, or "long SMS") can be sent using multiple messages, in which case each message will start with a user data header (UDH) containing segmentation information. Since the UDH is part of the payload, the number of available characters per segment is lower: 153 for 7-bit encoding, 134 for 8-bit encoding and 67 for 16-bit encoding. The receiving handset is then responsible for reassembling the message and presenting it to the user as one long message. While the standard theoretically permits up to 255 segments,[36] messages of 6 to 8 segments are the practical maximum, and long messages are often billed as equivalent to multiple SMS messages. Some providers have offered length-oriented pricing schemes for messages, but this practice is disappearing.
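The per-segment figures can be checked the same way. The sketch below assumes a 6-octet UDH and that every character occupies exactly one unit of its encoding (GSM 7-bit extension characters, which take two septets, are ignored). The function names are hypothetical.

```python
import math

PAYLOAD_OCTETS = 140  # payload limit per short message


def chars_per_segment(bits_per_char: int, udh_octets: int = 6) -> int:
    """Characters that fit in one segment once the UDH is subtracted."""
    return ((PAYLOAD_OCTETS - udh_octets) * 8) // bits_per_char


def segments_needed(n_chars: int, bits_per_char: int, udh_octets: int = 6) -> int:
    """Number of short messages needed to carry n_chars characters."""
    single_limit = (PAYLOAD_OCTETS * 8) // bits_per_char
    if n_chars <= single_limit:
        return 1  # fits in one message, no UDH required
    return math.ceil(n_chars / chars_per_segment(bits_per_char, udh_octets))


print(chars_per_segment(7), chars_per_segment(8), chars_per_segment(16))
print(segments_needed(71, 16))  # one character over the UCS-2 limit
```

Note the jump at the boundary: 70 UCS-2 characters fit in one message, but 71 already cost two.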
===========================
One way of sending concatenated SMS (CSMS) is to split the message into parts of 153 7-bit characters (134 octets) and send each part with a User Data Header (UDH) tacked onto the beginning. A UDH can be used for various purposes, and its contents and size vary accordingly, but a UDH for concatenating SMSes has the form 05 00 03 XX MM NN, where XX is the CSMS reference number, MM the total number of parts and NN the sequence number of this part.
It is possible to use a 16-bit CSMS reference number in order to reduce the probability that two different concatenated messages are sent with identical reference numbers to a receiver. In this case, the User Data Header has the form 06 08 04 XX XX MM NN.
Example of the UDH for an SMS split into two parts:
05 00 03 CC 02 01 [ message ]
05 00 03 CC 02 02 [ message ]
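Headers like the ones in this example can be generated with a few lines of code. This is only a sketch: `concat_udh` is a name I made up, and the 16-bit variant uses the 06 08 04 prefix described later in this post.

```python
def concat_udh(ref: int, total: int, seq: int, wide_ref: bool = False) -> bytes:
    """Build a concatenation UDH with an 8-bit (default) or 16-bit reference."""
    if not 1 <= seq <= total <= 255:
        raise ValueError("need 1 <= seq <= total <= 255")
    if wide_ref:
        if not 0 <= ref <= 0xFFFF:
            raise ValueError("16-bit reference out of range")
        # 06 = remaining header length, 08 = element id, 04 = element data length
        return bytes([0x06, 0x08, 0x04]) + ref.to_bytes(2, "big") + bytes([total, seq])
    if not 0 <= ref <= 0xFF:
        raise ValueError("8-bit reference out of range")
    # 05 = remaining header length, 00 = element id, 03 = element data length
    return bytes([0x05, 0x00, 0x03, ref, total, seq])


for seq in (1, 2):
    print(concat_udh(0xCC, 2, seq).hex(" ").upper(), "[ message ]")
```

Both parts share the reference CC and the total 02; only the last byte changes.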
Note: if a UDH is present and the data encoding is the default 7-bit alphabet, the user data must be aligned to a septet (7-bit) boundary after the UDH.[2] This means up to 6 zero bits need to be inserted at the start of the [message].
For example, with the UDH of a single-part message,
05 00 03 CC 01 01
the UDH is a total of (number of octets x bits per octet) 6 x 8 = 48 bits long. Since 48 is not a multiple of 7, a single bit of padding has to be prepended to the message. The UDH plus padding therefore occupies (bits for UDH + padding) / (bits per septet) = (48 + 1)/7 = 7 septets.
With a message of "hello world", the [message] is encoded as
D0 65 36 FB 0D BA BF E5 6C 32
whereas without padding, the [message] would be
E8 32 9B FD 06 DD DF 72 36 19
and the UDL is 7 (header septets) + 11 (message septets) = 18 septets.
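The padding and packing above can be reproduced in a few lines. This is a sketch with hypothetical helper names (`fill_bits`, `pack_septets`). It relies on the fact that the characters in "hello world" have the same code points in ASCII and in the GSM 7-bit default alphabet, which is not true of every character.

```python
def fill_bits(udh_octets: int) -> int:
    """Zero bits needed after the UDH to reach the next septet boundary."""
    return (7 - (udh_octets * 8) % 7) % 7


def pack_septets(septets, pad_bits: int = 0) -> bytes:
    """Pack 7-bit values LSB-first into octets, after pad_bits zero bits."""
    acc = 0
    nbits = pad_bits
    for s in septets:
        acc |= (s & 0x7F) << nbits
        nbits += 7
    return acc.to_bytes((nbits + 7) // 8, "little")


udh = bytes.fromhex("0500 03CC 0101")
msg = [ord(c) for c in "hello world"]  # valid only for characters shared with ASCII

pad = fill_bits(len(udh))
udl = (len(udh) * 8 + pad) // 7 + len(msg)

print(pad, udl)
print(pack_septets(msg, pad).hex(" ").upper())  # padded form
print(pack_septets(msg).hex(" ").upper())       # unpadded form
```

Packing into one large integer and emitting it little-endian keeps the bit order identical to the usual octet-by-octet shifting, while making the effect of the fill bits obvious: the padded form is simply the unpadded bit stream shifted by one bit.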
=======================================
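Long SMS: when the message content exceeds 70 Chinese characters, it has to be split into several messages when submitted to the gateway, yet the user's handset presents it as a single message. (This is the SP's point of view; the concept is the same when a handset sends a long SMS.)
In the CMPP protocol, the CMPP_SUBMIT message defines the relevant parameter:
TP_udhi: 0 means the message body contains no protocol header; 1 means the body contains a protocol header. (Long SMS, push SMS and the like all carry a header in the body, which amounts to setting the basic parameter octet (TP-MTI/VPF) to 0x51.) When the body is flagged as containing a header, the header must be written according to the protocol. There are two header formats for long SMS: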
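6-byte header format: 05 00 03 XX MM NN
byte 1 : 05, the length of the rest of the header
byte 2 : 00, defined in GSM 03.40 section 9.2.3.24.1; it indicates that the reference number of this long SMS (XX in the format) is 1 byte long.
byte 3 : 03, the length of the data that follows (XX MM NN)
byte 4 : XX, the unique reference for this batch of segments (it must be identical in every segment of the same long message). In practice, once the SME (handset or SP) has reassembled the message it starts tracking afresh, so whether this value is globally unique is not very important.
byte 5 : MM, the number of segments in the batch. If a long SMS consists of 5 segments, this value is 5.
byte 6 : NN, the sequence number of this segment within the batch: 1 for the first segment, 2 for the second.
Example: 05 00 03 39 02 01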
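7-byte header format: 06 08 04 XX XX MM NN
byte 1 : 06, the length of the rest of the header
byte 2 : 08, defined in GSM 03.40 section 9.2.3.24.1; it indicates that the reference number of this long SMS (XX XX in the format) is 2 bytes long.
byte 3 : 04, the length of the data that follows (XX XX MM NN)
byte 4-5 : XX XX, the unique reference for this batch of segments. In practice, once the SME (handset or SP) has reassembled the message it starts tracking afresh, so whether this value is globally unique is not very important.
byte 6 : MM, the number of segments in the batch. If a long SMS consists of 5 segments, this value is 5.
byte 7 : NN, the sequence number of this segment within the batch: 1 for the first segment, 2 for the second.
Example: 06 08 04 00 39 02 01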
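On the receiving side, the two formats can be told apart by the second byte of the header. Below is a minimal parsing sketch; the function name and the returned dictionary keys are my own, and it assumes the header starts at offset 0 of the message body.

```python
def parse_concat_udh(body: bytes):
    """Return concatenation info from a message body, or None if absent."""
    if not body:
        return None
    udhl = body[0]                      # length of the rest of the header
    header = body[1:1 + udhl]
    i = 0
    while i + 2 <= len(header):         # walk the information elements
        iei, iedl = header[i], header[i + 1]
        data = header[i + 2:i + 2 + iedl]
        if len(data) < iedl:
            return None                 # truncated header
        if iei == 0x00 and iedl == 3:   # 1-byte reference
            return {"ref": data[0], "total": data[1], "seq": data[2],
                    "header_len": udhl + 1}
        if iei == 0x08 and iedl == 4:   # 2-byte reference
            return {"ref": int.from_bytes(data[:2], "big"),
                    "total": data[2], "seq": data[3],
                    "header_len": udhl + 1}
        i += 2 + iedl                   # skip unrelated elements
    return None


print(parse_concat_udh(bytes.fromhex("050003390201")))
print(parse_concat_udh(bytes.fromhex("06080400390201")))
```

`header_len` tells the caller how many bytes to skip before the actual text begins (6 or 7 for the two formats here).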
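At this point the setup for sending long SMS is essentially complete, with one caveat: the Src_Id field in the protocol must be identical for every segment of a long SMS; otherwise the handset will parse it as three separate messages, all three of them garbled.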
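After two days of fiddling with CMPP long SMS, the short version is: the CMPP fields pk_total and pk_number need to be set, and udhi needs to be set to 1. With the 6-byte UDH, split the message content into 134-byte chunks and prepend the 6-byte UDH to each, which gives exactly 140 bytes. (This is a boundary case: some CMPP servers require strictly fewer than 140 bytes and reject exactly 140, so you have to test this yourself.) Also note the last three bytes of the UDH: the first is the reference number, which only needs to avoid repeating; the other two must match pk_total/pk_number. In addition, the seq_num of these segments must be the same. Receiving on a Nokia 6630 worked fine: the handset automatically joined the segments into one long message.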
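Putting the recipe together, here is a sketch of the splitting step only. It does not talk to any gateway: `split_long_sms` and the dictionary keys are hypothetical names that merely mirror the CMPP fields mentioned above. It assumes the text contains only BMP characters (so UTF-16BE equals UCS-2 and a 2-byte-aligned cut never splits a character). `chunk_octets` is a parameter because of the "strictly fewer than 140 bytes" servers.

```python
def split_long_sms(text: str, ref: int, chunk_octets: int = 134):
    """Split text into UCS-2 segments, each prefixed with a 6-byte UDH."""
    data = text.encode("utf-16-be")       # equals UCS-2 for BMP characters
    if len(data) <= 140:                  # fits in one message, no header
        return [{"tp_udhi": 0, "pk_total": 1, "pk_number": 1,
                 "msg_content": data}]
    if chunk_octets % 2 or not 2 <= chunk_octets <= 134:
        raise ValueError("chunk_octets must be even and at most 134")
    chunks = [data[i:i + chunk_octets]
              for i in range(0, len(data), chunk_octets)]
    total = len(chunks)
    if total > 255:
        raise ValueError("too many segments")
    segments = []
    for number, chunk in enumerate(chunks, start=1):
        udh = bytes([0x05, 0x00, 0x03, ref & 0xFF, total, number])
        segments.append({"tp_udhi": 1, "pk_total": total,
                         "pk_number": number, "msg_content": udh + chunk})
    return segments


for seg in split_long_sms("\u77ed" * 71, ref=0x39):
    print(seg["pk_number"], seg["pk_total"], len(seg["msg_content"]))
```

For a server that rejects bodies of exactly 140 bytes, calling it with `chunk_octets=132` keeps every segment at 138 bytes or fewer.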