Test::Nginx User Guide

1. Introduction

1.1 Overview

Test::Nginx is an nginx testing framework written in Perl and built on top of Test::Base.

1.2 Installation

sudo cpan Test::Nginx

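1.3 Test directory layout

The test directory is best named t, because during a run the framework creates a working directory inside t (t/servroot by default) that holds the current test's configuration and logs. There is presumably an option to choose a different location, but that has not been investigated here.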

cc_cache/
├── go.sh
└── t
    ├── 001-test.t
    ├── 002-strategy.t
    ├── 003-rule.t
    └── ANYU.pm

1.4 Test file format

A Test::Nginx test file follows a specific layout.

#codes...

__DATA__

#suites...

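As the snippet shows, the __DATA__ marker splits each test file into a code part and a test-case part; the Perl interpreter only executes the code part.

__DATA__
This is plain Perl syntax. Just as <STDIN> reads from standard input, the <DATA> file handle reads from the script itself (everything after the special __DATA__ token). The token also marks the end of the script's program logic.
Note: to read the data a second time, the handle's offset must first be moved back to the start of the data section.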
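
The note about resetting the offset can be illustrated with a minimal sketch. It assumes a standalone Perl script rather than a Test::Nginx file, and the variable names are only illustrative. The position is recorded with tell before the first read, because seeking to offset 0 would rewind to the top of the script instead of the start of the data section.

```perl
use strict;
use warnings;

# Remember where the data section starts before anything is read from it.
my $data_start = tell(DATA);

my @first = <DATA>;    # the first pass consumes the handle
my @empty = <DATA>;    # a second read returns nothing: the handle is at EOF

# Move the offset back to the start of the data section, then read again.
seek(DATA, $data_start, 0) or die "cannot seek DATA: $!";
my @second = <DATA>;

die "second pass differs from the first" unless "@first" eq "@second";
printf "first pass: %d lines, at EOF: %d lines, second pass: %d lines\n",
    scalar(@first), scalar(@empty), scalar(@second);

__DATA__
=== TEST 1: hello
--- request
GET /t
```

Test::Nginx users normally never touch DATA directly, since run_tests() reads it on their behalf; the sketch only shows the underlying Perl mechanism.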

1.4.1 Code part

# Load the test module; 'no_plan' is usually sufficient
use Test::Nginx::Socket 'no_plan';
# Run the test cases
run_tests();

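1.4.2 Test-case part

The test-case part consists of a series of test blocks. Each block has a title, an optional description, and a set of sections.

# Block title; a consistent title prefix makes the results easier to analyze
# The reindex command can generate the TEST number prefixes automatically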
=== title TEST01: NAME
optional description

# value1 is multi-line data, newlines included
# One or more filters can be applied to preprocess value1
--- section1 (filter ...)
value1

# In this single-line form, value2 has its surrounding whitespace stripped by default, so it carries no trailing newline
--- section2 : value2
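
To make the block layout concrete, here is a minimal sketch in plain Perl of how a spec in this format could be split into a title, a description, and sections. It is a simplified illustration, not the parser that Test::Base actually uses: parse_blocks and the sample $spec are hypothetical names, and filters are ignored.

```perl
use strict;
use warnings;

# Split a spec into blocks; each block becomes a hash with title, description, sections.
sub parse_blocks {
    my ($text) = @_;
    my @blocks;
    # Every block starts at a line beginning with "=== ".
    for my $hunk (split /^(?==== )/m, $text) {
        next unless $hunk =~ s/\A===[ \t]+(.*)\n//;
        my %block = (title => $1, sections => {});
        # With two captures, split yields the description followed by
        # (name, rest-of-line, body) triples, one per "--- name" line.
        my ($description, @parts) = split /^---[ \t]+(\w+)(.*)\n/m, $hunk;
        $block{description} = $description // '';
        while (my ($name, $rest, $body) = splice(@parts, 0, 3)) {
            if (($rest // '') =~ /^\s*:\s*(.*?)\s*$/) {
                $block{sections}{$name} = $1;            # single-line form, trimmed
            } else {
                $block{sections}{$name} = $body // '';   # multi-line form, kept verbatim
            }
        }
        push @blocks, \%block;
    }
    return @blocks;
}

my $spec = <<'_EOS_';
=== TEST 1: hello, world
This is just a simple demonstration
--- request
GET /t
--- error_code: 200
_EOS_

my ($block) = parse_blocks($spec);
print "title:      $block->{title}\n";
print "request:    $block->{sections}{request}";
print "error_code: $block->{sections}{error_code}\n";
```

The multi-line value keeps its trailing newline while the single-line value does not, which is the distinction the two section forms are meant to express.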

Examples of commonly used sections:

=== TEST 1: hello, world
This is just a simple demonstration

# Add configuration to the server {} block
--- config
location = /t {
    echo "hello, world!";
}

# Send the test request
--- request
GET /t

# Check the response body
# chomp strips the trailing newline from the value
--- response_body chomp
hello, world!

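# Check the response status code; defaults to 200 when the section is not given explicitly
--- error_code: 200

# Run only this block
--- ONLY

# Skip this block
--- SKIP

# eval runs the value as Perl code and uses the result
--- response_body eval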
"a" x 4096

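1.4.3 More sections

Section	Description	Notes
config	Inserts configuration into the server block
http_config	Inserts configuration into the http block
main_config	Inserts configuration at the top level
request	Sends a request	HTTP/1.1 by default; the version can be given as a suffix
pipelined_requests	Sends multiple requests	request appears to be converted to this automatically
response_body	Expected response body	Combine with eval to match against a qr// regex
response_body_like	Regex for the response body	response_body appears to be converted to this automatically
error_code	Expected status code	Defaults to 200
more_headers	Extra request headers
response_headers	Checks response headers
error_log	Patterns that must appear in the error log
no_error_log	Patterns that must not appear in the error log	Use [error] to assert that nothing went wrong
grep_error_log	Pattern used to filter the error log	The result is compared with grep_error_log_out
grep_error_log_out	Expected output of the log filter	Like response_body; may span multiple lines
wait	Time to wait for the log	In seconds; fractional values allowed
must_die	Expects startup to fail	Flag; the test passes if nginx fails to start
ONLY	Runs only this block	Flag
SKIP	Skips this block	Flag
ignore_response	Skips the response integrity check	By default the response is validated as HTTP/1.1
timeout	Connection timeout	Defaults to 3 seconds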

During a run, a configuration file t/servroot/conf/nginx.conf is generated for each block.
New sections can be added, and existing ones extended, by writing Perl code.

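1.5 Wrapping shared code

Tests are normally launched with prove, which runs each Perl test script and collects the results. Common startup parameters, test configuration, and helper functions can be wrapped in a shared module.

-- Wrapper module --
package t::ANYUTester;

# Enable HUP test mode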
BEGIN {
    if ($ENV{TEST_NGINX_CHECK_LEAK}) {
        $SkipReason = "unavailable for the hup tests";
    } else {
        $ENV{TEST_NGINX_USE_HUP} = 1;
        undef $ENV{TEST_NGINX_USE_STAP};
    }
}

# The arguments that would normally take the place of -Base (e.g. 'no_plan') are supplied by the file that loads t::ANYUTester
use Test::Nginx::Socket -Base;

repeat_each(2);
log_level('info');

# Run blocks in file order (by default the order is shuffled)
no_shuffle();

# Use diff-style output for string comparisons (response_body, grep_error_log_out, etc.)
no_long_string();

# Preprocess each block before it runs
add_block_preprocessor(sub {
    my ($block) = @_;

    # Extend the http-level configuration
    my $http_config = $block->http_config;
    # _EOCH_ is a heredoc delimiter (Perl's multi-line string syntax)
    $http_config .= <<'_EOCH_';
    lua_package_path "/home/work/cc/cc_nginx/luafiles/?.lua;;";
    lua_shared_dict cache_status 100m;
    lua_shared_dict cache_status_lock 100k;
_EOCH_
    $block->set_value("http_config", $http_config);

    # Extend the server-level configuration
    my $config = $block->config;
    $config .= <<'_EOCS_';
    location /cc_cache {
        rewrite_by_lua_file /home/work/cc/cc_nginx/luafiles/cc/api/cache.lua;
    }

    location /rule_check {
        rewrite_by_lua_block {
            local host = ngx.var.arg_host
            local module = ngx.var.arg_module
            local uri = ngx.var.arg_uri
            local infos = {
                host = host,
                uri = uri,
            }

            local rule_api = require "cc.rule.cache"
            local strategy_api = require "cc.strategy.cache"

            local key = module .. ":" .. host
            local rules = strategy_api.lru_get(key)
            if rules then
                ngx.print(rule_api.match(infos, rules))
            else
                ngx.print("miss")
            end
        }
    }
_EOCS_
    $block->set_value("config", $config);

});

1;
__END__

-- Main test file --
use t::ANYUTester 'no_plan';
run_tests();

__DATA__
# test suites...

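1.6 Test launcher

Environment variables and other startup settings can be wrapped in a launcher script, which avoids polluting the shell environment and makes the tests easier to run.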

#!/usr/bin/env bash

export PATH=/home/work/cc/cc_nginx/nginx/sbin:$PATH

# On some systems Perl reports (Can't locate t/.pm in @INC)
export PERL5LIB=$(pwd):$PERL5LIB

exec prove "$@"

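Run the tests from the shell with $ sh go.sh t/ ; a single .t file can be given instead to run only that file.

-v show the result of every test point
-r recurse into the given directories

When writing .t files in vim, :!prove % runs the current file and shows the result immediately.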

2. Advanced Test::Nginx

2.1 Background

2.1.1 Referencing Perl variables

# codes...
our $after_10_min = time()+600;
run_tests();
# datas...
'prefix "ttl":' . $::after_10_min . 'suffix'
# datas...

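Variables can only be referenced in sections that are evaluated with eval (run as code by the Perl interpreter), such as request or response_body. The eval most likely does not run in the same package as the code before run_tests() (the variable may be resolved while the block is being processed), so the variable must be declared with our as a package global.

our versus my in Perl
my declares a true lexical variable that is only accessible inside its enclosing block.
our declares a package global stored in the symbol table (accessible from anywhere through its fully qualified name).
P.S. $::var is shorthand for $main::var, the package global in main.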
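
The difference can be seen in a small standalone sketch. The package Other and the sub read_vars are hypothetical stand-ins for framework code that runs in a different package; they are not part of Test::Nginx.

```perl
use strict;
use warnings;

our $after_10_min = 1600;    # package global, stored in main's symbol table
{
    my $hidden = 42;         # lexical, exists only inside this block
}

package Other;               # hypothetical package standing in for framework code

sub read_vars {
    no strict 'refs';
    my $global  = $::after_10_min;      # same as $main::after_10_min
    my $lexical = ${"main::hidden"};    # a 'my' variable never enters the symbol table
    return ($global, $lexical);
}

package main;

my ($global, $lexical) = Other::read_vars();
print "global:  $global\n";                                           # prints "global:  1600"
print "lexical: ", (defined $lexical ? $lexical : "not visible"), "\n";   # prints "lexical: not visible"
```

Code in another package can reach the our variable through its qualified name, while the my variable cannot be found at all, which is why values shared with eval sections need our.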

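References follow normal Perl syntax: concatenate strings with . or interpolate the variable directly inside a double-quoted string.

Note: standard JSON requires double quotes ("data"); 'data' is not valid.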
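
Both ways of building a JSON body can be sketched as follows. The fixed timestamp and the variable names $by_concat and $by_interp are assumptions made for the example; qq() is used so that the double quotes required by JSON need no escaping.

```perl
use strict;
use warnings;

our $after_10_min = 1700000600;    # fixed value so the result is reproducible

# Concatenation with '.': the single-quoted parts are taken literally.
my $by_concat = '{"ttl":' . $::after_10_min . '}';

# Interpolation inside a double-quoted string, written with qq() to keep the JSON quotes readable.
my $by_interp = qq({"ttl":$::after_10_min});

print "$by_concat\n";    # prints {"ttl":1700000600}
print "$by_interp\n";    # prints {"ttl":1700000600}
```

In a test block, either expression would be the value of a section marked with eval.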

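2.1.2 Lua phase directives

init_by_lua
Runs while nginx loads its configuration at startup
init_worker_by_lua
Runs when each worker process starts
set_by_lua
For simple computations that set a variable; valid in server/location
rewrite_by_lua
Runs after the standard HttpRewriteModule; valid in http/server/location
access_by_lua
Access phase; runs after the standard HttpAccessModule and is used for access control; valid in http/server/location
content_by_lua
Content handler that receives the request and produces the response; valid in location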
header_filter_by_lua
body_filter_by_lua
log_by_lua

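Reference: https://www.cnblogs.com/JohnABC/p/6206622.html

2.1.3 Comparing a response that is a table

An extra field in the JSON request carries the expected result table. A table serialization function (in ordered mode) then converts both the actual and the expected result to strings for comparison.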

if data.check_result then
    local check_result = data.check_result
    if is_table(check_result) then
        -- table2string serializes a table into a canonical string with keys in a fixed order
        check_result = util.table2string(check_result, true)
        result = util.table2string(json_decode(result), true)
    else
        check_result = tostring(check_result)
    end

    ngx_log(ngx_ERR, "result: " .. result)
    ngx_log(ngx_ERR, "check_result: " .. check_result)

    if result == check_result then
        ngx_exit(200, 'true')
    else
        ngx_exit(400, 'false')
    end
else
    ngx_exit(200, result)
end

2.1.4 Skipping an entire .t file

use Test::Nginx::Socket skip_all => "some reasons";

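2.2 Timeout testing techniques

2.2.1 Connect timeout

A connect timeout can be produced by adding a firewall rule somewhere on the path to a given server IP that drops all SYN packets. Below is the official example, which uses such a "black hole" address.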

=== TEST 1: connect timeout
--- config
    # DNS server used for name resolution
    resolver 8.8.8.8;
    resolver_timeout 1s;

    location = /t {
        content_by_lua_block {
            local sock = ngx.socket.tcp()
            sock:settimeout(100) -- ms
            local ok, err = sock:connect("agentzh.org", 12345)
            if not ok then
                ngx.log(ngx.ERR, "failed to connect: ", err)
                return ngx.exit(500)
            end
            ngx.say("ok")
        }
    }
--- request
GET /t
--- response_body_like: 500 Internal Server Error
--- error_code: 500
--- error_log
failed to connect: timeout

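Timeouts should generally be shortened in tests to keep the suite fast. Note that Test::Nginx's default connection timeout is 3 seconds.

2.2.2 Read timeout

A read timeout can be produced by having the server hold the connection open without sending anything, for example with sleep.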

=== TEST 1: read timeout
--- main_config
    # Mock TCP server
    stream {
        server {
            listen 5678;
            content_by_lua_block {
                ngx.sleep(10)  -- 10 sec
            }
        }
    }
--- config
    lua_socket_log_errors off;
    location = /t {
        content_by_lua_block {
            local sock = ngx.socket.tcp()
            sock:settimeout(100) -- ms
            assert(sock:connect("127.0.0.1", 5678))
            ngx.say("connected.")
            local data, err = sock:receive()  -- try to read a line
            if not data then
                ngx.say("failed to receive: ", err)
            else
                ngx.say("received: ", data)
            end
        }
    }
--- request
GET /t
--- response_body
connected.
failed to receive: timeout
--- no_error_log
[error]

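2.2.3 Send timeout

Use mockeagain together with Lua test code.
Reference: https://openresty.gitbooks.io/programming-openresty/content/testing/testing-erroneous-cases.html

2.2.4 Client abort

Combine timeout with abort: sleep in the server to delay the response so that the client gives up, then inspect the server log to determine the outcome.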

=== TEST 1: abort processing in the Lua callback on client aborts
--- config
    location = /t {
        # Detect client abort events
        lua_check_client_abort on;

        content_by_lua_block {
            -- Register a callback that stops request processing when the client aborts
            local ok, err = ngx.on_abort(function ()
                ngx.log(ngx.NOTICE, "on abort handler called!")
                ngx.exit(444)
            end)

            if not ok then
                error("cannot set on_abort: " .. err)
            end

            ngx.sleep(0.7)  -- sec
            ngx.log(ngx.NOTICE, "main handler done")
        }
    }
--- request
    GET /t
--- timeout: 0.2
# Tell Test::Nginx not to report the client timeout as an error
--- abort
--- ignore_response
--- no_error_log
[error]
main handler done
--- error_log
client prematurely closed connection
on abort handler called!

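2.3 Test modes

2.3.1 Benchmark mode

Setting the environment variable TEST_NGINX_BENCHMARK before running prove enables benchmark mode. With HTTP/1.1 the weighttp tool is invoked; with HTTP/1.0, ab is used. Both tools must be installed separately.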

# Sends 200000 requests with a concurrency of 2
# weighttp -c2 -k -n200000 $url
# ab -r -d -S -c2 -k -n200000 $url
export TEST_NGINX_BENCHMARK='200000 2'
prove t/foo.t

The output includes the benchmark results for reference.

---weighttp---
finished in 2 sec, 652 millisec and 752 microsec, 75393 req/s, 12218 kbyte/s
requests: 200000 total, 200000 started, 200000 done, 200000 succeeded, 0 failed, 0 errored
status codes: 200000 2xx, 0 3xx, 0 4xx, 0 5xx

---ab---
Time taken for tests:   3.001 seconds
Complete requests:      200000
Failed requests:        0
Requests per second:    66633.75 [#/sec] (mean)
Transfer rate:          10798.70 [Kbytes/sec] received

Adding the following calls before run_tests() enables multi-process testing.

master_on();
workers(4);

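2.3.2 HUP mode

To keep test blocks isolated from one another, Test::Nginx by default starts a separate nginx instance for each block. When blocks need to share one instance (for example to preserve shared memory), the framework must instead switch between blocks by sending the HUP signal.

# Reload with HUP between blocks
export TEST_NGINX_USE_HUP=1

# Also reload with HUP between files
export TEST_NGINX_NO_CLEAN=1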

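The same can be done in Perl at the top of a .t file.

# Switch between blocks with HUP so that shared memory and similar state are preserved.
BEGIN {
    # CHECK_LEAK mode runs a simple memory leak check with weighttp/ab plus ps
    if ($ENV{TEST_NGINX_CHECK_LEAK}) {
        $SkipReason = "unavailable for the hup tests";
    } else {
        # Switch nginx between blocks with HUP (this does not seem to apply to repeat_each runs)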
        $ENV{TEST_NGINX_USE_HUP} = 1;
        undef $ENV{TEST_NGINX_USE_STAP};
    }
}

use Test::Nginx::Socket 'no_plan';

# test codes and suites

To reload OpenResty in the middle of a block, the only option is to do it from Lua code.

local f = assert(io.open("t/servroot/logs/nginx.pid", "r"))
local master_pid = assert(f:read())
assert(f:close())
assert(os.execute("kill -HUP " .. master_pid) == 0)

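Notes on the signals nginx handles.

kill -HUP nginx_master_pid    (the PID stored in /var/run/nginx.pid)
When nginx receives HUP, it first tries to parse the configuration file (the one specified on the command line, otherwise the default). If parsing succeeds, it applies the new configuration (for example, reopening log files and listening sockets), starts new worker processes, and gracefully shuts down the old ones: the old workers are told to close their listening sockets but keep serving their current connections, and they exit once all clients have been served. If the new configuration cannot be applied, nginx keeps running with the old one.

TERM, INT  Fast shutdown (nginx -s stop)
QUIT  Graceful shutdown; waits for in-flight requests to finish (nginx -s quit)
HUP  Graceful restart that reloads the configuration (using reload loads the configuration twice)
USR1  Reopen the log files (nginx -s reopen); mainly useful for log rotation
USR2  Upgrade the executable on the fly
WINCH  Gracefully shut down the worker processes (used together with USR2 for version upgrades)

HUP reloading has side effects, of course: while old and new workers overlap, results can occasionally differ from what a test expects.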

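2.3.3 Valgrind mode

In OpenResty, memory problems usually arise in lower-level code (Nginx, LuaJIT, FFI). Memory problems in Lua code (such as unbounded growth of global tables and variables) cannot be detected by Valgrind, because that memory is managed by the LuaJIT garbage collector.

export TEST_NGINX_USE_VALGRIND=1

Here is an official test example.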

=== TEST 1: C strlen()
--- config
    location = /t {
        content_by_lua_block {
            local ffi = require "ffi"
            local C = ffi.C

            -- Avoid redeclaring the function on every request; pcall checks whether it is already declared
            if not pcall(function () return C.strlen end) then
                ffi.cdef[[
                    size_t strlen(const char *s);
                ]]
            end

            local buf = ffi.new("char[3]", {48, 49, 0})
            local len = tonumber(C.strlen(buf))
            ngx.say("strlen: ", len)
        }
    }
--- request
    GET /t
--- response_body
strlen: 2
--- no_error_log
[error]

Running the test directly in Valgrind mode produces some false-positive warnings, such as the following:

t/a.t .. TEST 1: C strlen()
==7366== Invalid read of size 4
==7366==    at 0x546AE31: str_fastcmp (lj_str.c:57)
==7366==    by 0x546AE31: lj_str_new (lj_str.c:166)
==7366==    by 0x547903C: lua_setfield (lj_api.c:903)
==7366==    by 0x4CAD18: ngx_http_lua_cache_store_code (ngx_http_lua_cache.c:119)
==7366==    by 0x4CAB25: ngx_http_lua_cache_loadbuffer (ngx_http_lua_cache.c:187)
==7366==    by 0x4CB61A: ngx_http_lua_content_handler_inline (ngx_http_lua_contentby.c:300)

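This is caused by LuaJIT's memory optimizations. The LuaJIT repository contains a file, lj.supp, that lists all known Valgrind false positives; copy it into the working directory under the name valgrind.suppress so that the test mode passes it to Valgrind.

cp -i /path/to/luajit-2.0/src/lj.supp ./valgrind.suppress

Running the test again now gives the correct result.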

t/a.t .. TEST 1: C strlen()
t/a.t .. ok
All tests successful.
Files=1, Tests=3,  2 wallclock secs ( 0.01 usr  0.00 sys +  1.51 cusr  0.06 csys =  1.58 CPU)
Result: PASS

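False positives from the nginx core, the OpenSSL library, and other components may also appear during testing. In most cases the framework detects them and prints, at the end of the test output, suppression entries that can be added to valgrind.suppress, as shown below.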

{
   <insert_a_suppression_name_here>
   Memcheck:Addr4
   fun:str_fastcmp
   fun:lj_str_new
   fun:lua_setfield
   fun:ngx_http_lua_cache_store_code
   fun:ngx_http_lua_cache_loadbuffer
   fun:ngx_http_lua_content_handler_inline
   fun:ngx_http_core_content_phase
   fun:ngx_http_core_run_phases
   fun:ngx_http_process_request
   fun:ngx_http_process_request_line
   fun:ngx_epoll_process_events
   fun:ngx_process_events_and_timers
   fun:ngx_single_process_cycle
   fun:main
}

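To test for leaks caused by objects allocated by Lua, LuaJIT must be built with flags that disable its own memory allocation strategy.

make CCDEBUG=-g XCFLAGS='-DLUAJIT_USE_VALGRIND -DLUAJIT_USE_SYSMALLOC'

Alternatively, pass the corresponding options when building OpenResty.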

./configure \
    --prefix=/opt/openresty-valgrind \
    --with-luajit-xcflags='-DLUAJIT_USE_VALGRIND -DLUAJIT_USE_SYSMALLOC' \
    --with-debug \
    -j4
make -j4
sudo make install

# Alternatively, OpenResty offers a more direct option that builds a version with pool memory management disabled
--with-no-pool-patch

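Since nginx 1.9.13 there is also a macro that gives an effect similar to the "no-pool patch", although it is not as thorough as disabling pool management at build time.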

#define NGX_DEBUG_PALLOC

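Valgrind cannot detect memory problems on the stack or at the application level.
AddressSanitizer, developed at Google, can also be used to detect memory leaks; each tool has its own strengths.

2.3.4 CheckLeak mode

Valgrind is very good at detecting memory leaks and invalid memory accesses, but it is limited when the leak sits inside an application-level memory manager such as a garbage collector (GC) or a memory pool, which is common in practice. Test::Nginx::Socket therefore provides another test mode for detecting memory problems.

export TEST_NGINX_CHECK_LEAK=1

In this mode the framework periodically uses ps to sample the memory the process occupies and fits a line through the samples. The slope k of that line is the rate of memory growth and indicates whether a leak is present. Calling the garbage collector regularly from Lua keeps k meaningful.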

collectgarbage()
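
The slope fit described above can be sketched with an ordinary least-squares line through evenly spaced samples. This is only an illustration of the idea, not the framework's own implementation: fit_slope is a hypothetical helper, and the sample values are made up to stand for memory sizes in KB as a periodic ps call might report them.

```perl
use strict;
use warnings;

# Least-squares slope of the samples over x = 0, 1, 2, ... (one sample per polling interval).
sub fit_slope {
    my @y = @_;
    my $n = @y;
    die "need at least two samples" if $n < 2;
    my ($sum_x, $sum_y, $sum_xy, $sum_xx) = (0, 0, 0, 0);
    for my $x (0 .. $n - 1) {
        $sum_x  += $x;
        $sum_y  += $y[$x];
        $sum_xy += $x * $y[$x];
        $sum_xx += $x * $x;
    }
    return ($n * $sum_xy - $sum_x * $sum_y) / ($n * $sum_xx - $sum_x ** 2);
}

# Hypothetical memory samples in KB.
my @steady  = (5000, 5004, 4998, 5002, 5000);
my @leaking = (5000, 5100, 5200, 5300, 5400);

printf "steady  k = %.1f\n", fit_slope(@steady);     # prints "steady  k = -0.2"
printf "leaking k = %.1f\n", fit_slope(@leaking);    # prints "leaking k = 100.0"
```

A slope near zero suggests stable memory use, while a clearly positive slope over many samples points to growth that deserves a closer look.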

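This mode has one major drawback, however: it gives no details about where a (possible) leak occurs. It only reports the data samples and a few other metrics, so it can confirm, at least to some degree, whether a leak exists, but nothing more.

The wiki probably describes further modes that address this; they remain to be explored.

3. Wiki and learning links

User guide: https://openresty.gitbooks.io/programming-openresty/content/
Details: https://metacpan.org/release/Test-Nginx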
