Non-ASCII characters in a URL must be encoded

Which characters in a URL need to be encoded? The rules are as follows:

Characters used in URLs must come from a fixed subset of ASCII, specifically:

  • The capital letters A-Z

  • The lowercase letters a-z

  • The digits 0-9

  • The punctuation characters - _ . ! ~ * ' (and ,)

The characters : / & ? @ # ; $ + = and % may also be used, but only for their specified purposes. If these characters occur as part of a filename, they and all other characters should be encoded.

The encoding is very simple. Any characters that are not ASCII numerals, letters, or the punctuation marks specified earlier are converted into bytes and each byte is written as a percent sign followed by two hexadecimal digits. Spaces are a special case because they're so common. Besides being encoded as %20, they can be encoded as a plus sign (+). The plus sign itself is encoded as %2B. The / # = & and ? characters should be encoded when they are used as part of a name, and not as a separator between parts of the URL.
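
The rules above can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the function name `percent_encode` and the `space_as_plus` flag are hypothetical, the allowed punctuation is read from the list above with the comma included, and UTF-8 is assumed for converting characters to bytes, since the text does not name a character set.

```python
import string

# Characters that may appear unencoded: letters, digits, and the
# punctuation listed above (comma included; this reading is an assumption).
ALWAYS_ALLOWED = set(string.ascii_letters + string.digits + "-_.!~*',")


def percent_encode(text, space_as_plus=False):
    """Encode text for use as a single name or value inside a URL.

    Every character outside ALWAYS_ALLOWED is converted to bytes
    (UTF-8 is assumed here) and each byte is written as '%' followed
    by two hexadecimal digits.
    """
    out = []
    for ch in text:
        if ch in ALWAYS_ALLOWED:
            out.append(ch)
        elif ch == " " and space_as_plus:
            # Spaces are the special case: '+' instead of '%20'.
            out.append("+")
        else:
            for byte in ch.encode("utf-8"):
                out.append(f"%{byte:02X}")
    return "".join(out)


print(percent_encode("a b"))                      # a%20b
print(percent_encode("a b", space_as_plus=True))  # a+b
print(percent_encode("1+1=2"))                    # 1%2B1%3D2
print(percent_encode("a/b?c"))                    # a%2Fb%3Fc
```

Because the function treats its input as a single name, separators such as / ? = and & are always encoded; a complete URL would be assembled by encoding each part separately and joining the parts with the real separators afterwards. Python's standard library offers `urllib.parse.quote` and `urllib.parse.quote_plus` for the same job, although their default set of unencoded characters differs from the one used here.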
