Erlang Cowboy: handler process shutdown on nginx 499

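Overview

Cowboy is a web server framework for Erlang. When the client disconnects before the response has been sent (nginx logs this as HTTP status 499), Cowboy kills the handler process outright. This can easily cause bugs.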

Example code

See https://ninenines.eu/docs/en/...
Consider the following handler:

-module(hello_handler).
-behavior(cowboy_handler).

-export([init/2]).

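init(Req0, State) ->
    erlang:display("before_sleep"),
    %% Simulate slow work so the client can disconnect mid-request.
    timer:sleep(3000),
    erlang:display("after_sleep"),
    %% cowboy_req:reply/4 returns an updated request object, so bind it
    %% to a new variable; matching it against Req0 would raise badmatch.
    Req = cowboy_req:reply(
        200,
        #{<<"content-type">> => <<"text/plain">>},
        <<"Hello Erlang!">>,
        Req0
    ),
    {ok, Req, State}.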

Running

curl http://localhost:8080

prints the following in the Erlang shell:

([email protected])1> "before_sleep"
"after_sleep"

If the client gives up early, however:

curl http://localhost:8080 --max-time 0.001
curl: (28) Resolving timed out after 4 milliseconds

only the first line is printed:

([email protected])1> "before_sleep"

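This shows that the handler process was forcibly cut off in the middle of its execution. If the code touches resources outside the process, for example by taking a lock, the lock will never be released.

Root cause

See loop in cowboy_http.erl: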

loop(State=#state{parent=Parent, socket=Socket, transport=Transport, opts=Opts,
        buffer=Buffer, timer=TimerRef, children=Children, in_streamid=InStreamID,
        last_streamid=LastStreamID}) ->
    Messages = Transport:messages(),
    InactivityTimeout = maps:get(inactivity_timeout, Opts, 300000),
    receive
        %% Discard data coming in after the last request
        %% we want to process was received fully.
        {OK, Socket, _} when OK =:= element(1, Messages), InStreamID > LastStreamID ->
            loop(State);
        %% Socket messages.
        {OK, Socket, Data} when OK =:= element(1, Messages) ->
            parse(<< Buffer/binary, Data/binary >>, State);
        {Closed, Socket} when Closed =:= element(2, Messages) ->
            terminate(State, {socket_error, closed, 'The socket has been closed.'});
        {Error, Socket, Reason} when Error =:= element(3, Messages) ->
            terminate(State, {socket_error, Reason, 'An error has occurred on the socket.'});
        {Passive, Socket} when Passive =:= element(4, Messages);
                %% Hardcoded for compatibility with Ranch 1.x.
                Passive =:= tcp_passive; Passive =:= ssl_passive ->
            setopts_active(State),
            loop(State);
        %% Timeouts.

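When the socket is closed, the Closed branch calls terminate, which eventually kills the child processes by sending them an exit signal (see terminate in cowboy_children.erl):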

-spec terminate(children()) -> ok.
terminate(Children) ->
    %% For each child, either ask for it to shut down,
    %% or cancel its shutdown timer if it already is.
    %%
    %% We do not need to flush stray timeout messages out because
    %% we are either terminating or switching protocols,
    %% and in the latter case we flush all messages.
    _ = [case TRef of
        undefined -> exit(Pid, shutdown);
        _ -> erlang:cancel_timer(TRef, [{async, true}, {info, false}])
    end || #child{pid=Pid, timer=TRef} <- Children],
    before_terminate_loop(Children).

Because the child processes do not trap exits, they terminate without any log output and without any chance to clean up.
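
The effect of exit(Pid, shutdown) on a process that does not trap exits can be reproduced without Cowboy. The following is a minimal sketch, not Cowboy's code: the module shutdown_demo and its functions are hypothetical names, the 200 ms sleep stands in for the handler's work, and the message sent in the after clause stands in for releasing a lock.

```erlang
-module(shutdown_demo).
-export([run/0]).

%% Runs the experiment twice: once with a worker that does not trap
%% exits (like the handler process above) and once with one that does.
run() ->
    {run_once(false), run_once(true)}.

%% Returns {ExitReason, CleanedUp} for a single worker.
run_once(TrapExit) ->
    Parent = self(),
    {Pid, MRef} = spawn_monitor(fun() -> worker(Parent, TrapExit) end),
    %% Wait until the worker is inside its "request handling".
    receive {Pid, started} -> ok end,
    %% Same signal that terminate/1 sends to each child.
    exit(Pid, shutdown),
    receive
        {'DOWN', MRef, process, Pid, Reason} ->
            %% Did the worker reach its cleanup code before it died?
            CleanedUp = receive
                {Pid, cleaned_up} -> true
            after 0 ->
                false
            end,
            {Reason, CleanedUp}
    end.

worker(Parent, TrapExit) ->
    process_flag(trap_exit, TrapExit),
    Parent ! {self(), started},
    try
        timer:sleep(200)               %% stands in for the real work
    after
        Parent ! {self(), cleaned_up}  %% stands in for releasing a lock
    end.
```

Calling shutdown_demo:run() returns {{shutdown,false},{normal,true}}. The first worker dies with reason shutdown and its after clause never runs, because an untrapped exit signal terminates the process immediately. The second worker receives the signal as an ordinary {'EXIT', From, shutdown} message, finishes its sleep, and runs its cleanup.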

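Summary

Because Cowboy kills the handler process as soon as the peer disconnects, bugs of this kind are easy to introduce. One way to avoid the problem is to set proxy_ignore_client_abort on in nginx, so that a client disconnect is not propagated to the backend.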
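
A minimal sketch of the nginx setting mentioned above, assuming nginx proxies to the Cowboy listener on port 8080 from the curl example; the location path and upstream address are placeholders to adapt to the actual deployment.

```nginx
location / {
    # Assumed upstream: the Cowboy listener used in the curl example.
    proxy_pass http://127.0.0.1:8080;

    # Keep the upstream connection open when the client aborts, so that
    # Cowboy does not see the socket close while the handler is running.
    proxy_ignore_client_abort on;
}
```

With this setting nginx waits for the backend response even though the client is gone, so the handler runs to completion at the cost of doing work whose result nobody will read.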
