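While reading the book Pro PHP Security today, I came across a section devoted to validating and sanitizing user input.
Several classes of characters in user input need to be converted; the most basic conversion is escaping with addslashes.
There are a few more points to consider:
(1) When user input is going to be displayed, it may contain HTML or JavaScript code. Converting such code is fairly straightforward: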
$string = str_replace(array("%3C", '<'), '&lt;', $string);
$string = str_replace(array("%3E", '>'), '&gt;', $string);
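As a rough sketch of the same idea, PHP's built-in htmlspecialchars can handle the angle brackets along with quotes and ampersands in one call. The helper name escape_for_html is made up for illustration and is not from the book; it assumes the input is UTF-8 and that percent-encoded angle brackets should be treated like literal ones, as in the two lines above.

```php
<?php
// Hypothetical helper (name is an assumption, not from the book).
function escape_for_html(string $input): string
{
    // Turn percent-encoded angle brackets (%3C, %3c, %3E, %3e) into literal
    // characters first, so the escaping step below sees them.
    $input = str_ireplace(array('%3C', '%3E'), array('<', '>'), $input);

    // Convert &, <, >, " and ' into HTML entities.
    return htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

echo escape_for_html('%3Cscript>alert("x")</script>'), "\n";
// prints: &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```

Escaping at output time like this leaves the stored data untouched, which may be preferable when the same input is later used in a non-HTML context.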
Character | Code Points (Hex) | Code Points (Dec) | Why encode? |
---|---|---|---|
Space | 20 | 32 | Significant sequences of spaces may be lost in some uses (especially multiple spaces) |
Quotation marks ("), 'Less Than' symbol ("<"), 'Greater Than' symbol (">") | 22, 3C, 3E | 34, 60, 62 | These characters are often used to delimit URLs in plain text. |
'Pound' character ("#") | 23 | 35 | This is used in URLs to indicate where a fragment identifier (bookmarks/anchors in HTML) begins. |
Percent character ("%") | 25 | 37 | This is used to URL encode/escape other characters, so it should itself also be encoded. |
Misc. characters: Left Curly Brace ("{"), Right Curly Brace ("}"), Vertical Bar/Pipe ("\|"), Backslash ("\"), Caret ("^"), Tilde ("~"), Left Square Bracket ("["), Right Square Bracket ("]"), Grave Accent ("`") | 7B, 7D, 7C, 5C, 5E, 7E, 5B, 5D, 60 | 123, 125, 124, 92, 94, 126, 91, 93, 96 | Some systems can possibly modify these characters. |
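To see how the characters in the table are encoded in practice, here is a minimal sketch using PHP's built-in rawurlencode. The helper name build_search_url and the search?q= path are hypothetical, chosen only for the example.

```php
<?php
// Hypothetical helper: percent-encode a user-supplied value before
// placing it in a URL query string.
function build_search_url(string $term): string
{
    // rawurlencode() percent-encodes everything except letters, digits
    // and the characters - _ . ~
    return 'search?q=' . rawurlencode($term);
}

echo build_search_url('a b#c'), "\n";   // prints: search?q=a%20b%23c

// Print the encoded form of each character listed in the table.
foreach (str_split(' "<>#%{}|\\^~[]`') as $char) {
    printf("%s => %s\n", $char, rawurlencode($char));
}
```

Note that current PHP versions leave the tilde unencoded, since rawurlencode follows RFC 3986, where "~" is an unreserved character; the table reflects an older, more conservative list.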
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]   /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
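The Char production above translates fairly directly into a regular expression. The following is a minimal sketch, assuming the input is UTF-8; the function name strip_invalid_xml_chars is made up for illustration, and dropping the offending characters (rather than rejecting the input) is just one possible policy.

```php
<?php
// Hypothetical helper: remove every character that falls outside the
// XML Char production.
function strip_invalid_xml_chars(string $utf8): string
{
    // Negated character class listing exactly the allowed ranges;
    // the /u modifier makes PCRE treat subject and pattern as UTF-8.
    $pattern = '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u';

    $result = preg_replace($pattern, '', $utf8);
    if ($result === null) {
        // preg_replace() returns null on error, e.g. malformed UTF-8 input.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $result;
}

// The NUL and backspace control characters are dropped; the newline is kept.
var_dump(strip_invalid_xml_chars("a\x00b\x08c\n") === "abc\n");   // bool(true)
```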