目录
1、在ThinkPHP 验证邮箱地址的函数。
2、在PHP源代码中查看该函数实现
3、提取之后编写个PHP测试脚本进行测试。
下载了ThinkPHP 5.1源代码,地址:https://github.com/HubQin/thinkPHP5.1-source-code-reading
在里面搜索“email”(忽略大小写),结果如下:
然后分别搜索"filter"和"FILTER_VALIDATE_EMAIL"。
filter的结果太多了,好像没有有用的信息。
在看看FILTER_VALIDATE_EMAIL的结果:
只有一条记录。
然后去Google去搜索“PHP FILTER_VALIDATE_EMAIL”。
去W3school看看使用示例:
filter_var是什么函数,怎么直接调用就可以了。在TP5.1中的搜索“filter_var”不存在,那就继续Google。
但都是些用法,没有看到其内部实现。
考虑看看PHP源代码。于是下载了PHP 5.4.45的源代码。下载地址:https://www.php.net/distributions/php-5.4.45.tar.bz2
在源代码中搜索“FILTER_VALIDATE_EMAIL”,很多结果。关键看该函数的内部实现。
FILTER_VALIDATE_EMAIL旁边的标志有个"php_filter_validate_email",于是继续搜索"php_filter_validate_email"。
打开./ext/filter/logical_filters.c
void php_filter_validate_email(PHP_INPUT_FILTER_PARAM_DECL) /* {{{ */
{
/*
* The regex below is based on a regex by Michael Rushton.
* However, it is not identical. I changed it to only consider routeable
* addresses as valid. Michael's regex considers a@b a valid address
* which conflicts with section 2.3.5 of RFC 5321 which states that:
*
* Only resolvable, fully-qualified domain names (FQDNs) are permitted
* when domain names are used in SMTP. In other words, names that can
* be resolved to MX RRs or address (i.e., A or AAAA) RRs (as discussed
* in Section 5) are permitted, as are CNAME RRs whose targets can be
* resolved, in turn, to MX or address RRs. Local nicknames or
* unqualified names MUST NOT be used.
*
* This regex does not handle comments and folding whitespace. While
* this is technically valid in an email address, these parts aren't
* actually part of the address itself.
*
* Michael's regex carries this copyright:
*
* Copyright © Michael Rushton 2009-10
* http://squiloople.com/
* Feel free to use and redistribute this code. But please keep this copyright notice.
*
*/
const char regexp[] = "/^(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){255,})(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){65,}@)(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22))(?:\\.(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-+[a-z0-9]+)*\\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-+[a-z0-9]+)*)|(?:\\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\\]))$/iD";
pcre *re = NULL;
pcre_extra *pcre_extra = NULL;
int preg_options = 0;
int ovector[150]; /* Needs to be a multiple of 3 */
int matches;
/* The maximum length of an e-mail address is 320 octets, per RFC 2821. */
if (Z_STRLEN_P(value) > 320) {
RETURN_VALIDATION_FAILED
}
re = pcre_get_compiled_regex((char *)regexp, &pcre_extra, &preg_options TSRMLS_CC);
if (!re) {
RETURN_VALIDATION_FAILED
}
matches = pcre_exec(re, NULL, Z_STRVAL_P(value), Z_STRLEN_P(value), 0, 0, ovector, 3);
/* 0 means that the vector is too small to hold all the captured substring offsets */
if (matches < 0) {
RETURN_VALIDATION_FAILED
}
}
从代码中可以看出,首先检查输入的字符长度不能超过320个。然后就用正则表达式匹配了。
这里的正则表达式好像用了ASCII码16进制,看不懂。
都是正则表达式应该是通用的编写个测试脚本试试。
$expr = "/^(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){255,})(?!(?:(?:\\x22?\\x5C[\\x00-\\x7E]\\x22?)|(?:\\x22?[^\\x5C\\x22]\\x22?)){65,}@)(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22))(?:\\.(?:(?:[\\x21\\x23-\\x27\\x2A\\x2B\\x2D\\x2F-\\x39\\x3D\\x3F\\x5E-\\x7E]+)|(?:\\x22(?:[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F\\x21\\x23-\\x5B\\x5D-\\x7F]|(?:\\x5C[\\x00-\\x7F]))*\\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-+[a-z0-9]+)*\\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-+[a-z0-9]+)*)|(?:\\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\\]))$/iD";
if(preg_match($expr,"[email protected]")){
echo "Matched\n";
}else{
echo "Invalided\n";
}
运行结果:
将[email protected]改成sw@opensourcecom,看看结果:
需要进行更多的验证。