[Incident] nginx intermittently returns 502 errors

Background

I deployed an nginx instance, and requests to it occasionally fail, as in the screenshot below:
[Figure 1: screenshot of the 502 error]
The nginx configuration:


	#user  nobody;
	worker_processes  1;

	#error_log  logs/error.log;
	#error_log  logs/error.log  notice;
	#error_log  logs/error.log  info;

	#pid        logs/nginx.pid;


	events {
		worker_connections  1024;
	}


	http {
		include       mime.types;
		default_type  application/octet-stream;

		#log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
		#                  '$status $body_bytes_sent "$http_referer" '
		#                  '"$http_user_agent" "$http_x_forwarded_for"';

		#access_log  logs/access.log  main;

		sendfile        on;
		#tcp_nopush     on;

		#keepalive_timeout  0;
		keepalive_timeout  65;

		#gzip  on;

		server {
			listen       8091;
			server_name  localhost;

			#charset koi8-r;

			#access_log  logs/host.access.log  main;
			location / {
				proxy_connect_timeout 300s;
				proxy_send_timeout 300s;
				proxy_read_timeout 300s;
				proxy_set_header            Host $host:$server_port;
				proxy_set_header            X-real-ip $remote_addr;
				proxy_set_header            X-Forwarded-For $proxy_add_x_forwarded_for;
				proxy_pass http://192.168.1.61:9199/;
			}

			error_page   500 502 503 504  /50x.html;
			location = /50x.html {
				root   html;
			}
		}
	}

The same configuration works without problems in the test environment, but in production requests intermittently fail with 502.

Locating the problem

Check the nginx error log.

2020/08/07 09:29:22 [crit] 12236#11116: *457 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *458 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *460 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *462 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *464 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *466 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *468 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *470 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *472 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *474 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *476 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *478 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *480 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *482 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *484 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *486 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *488 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *490 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *494 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *495 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:22 [crit] 12236#11116: *497 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"
2020/08/07 09:29:59 [crit] 12236#11116: *490 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full) while connecting to upstream, client: 183.17.229.173, server: localhost, request: "GET / HTTP/1.1", upstream: "http://172.18.44.6:8091/", host: "183.207.215.123:8091"

Every entry reports the same error:

10055: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full

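Error 10055 is the Winsock code WSAENOBUFS: the connect() call to the upstream failed because the operating system ran out of socket buffer space or one of its queues was full.
In other words, a system-level resource is exhausted, which looks more like a problem with the server (the Windows host) than with nginx itself.
I searched around online but did not find a good solution.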
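
Before changing anything, it can help to see how the failures cluster in time. The following is a minimal sketch, not part of the original investigation, that counts the connect() failures per second, upstream, and error code. The function name summarize_connect_failures and the regular expression are assumptions written against the log format shown above; the sample lines are shortened copies of those entries.

```python
import re
from collections import Counter

# Matches the timestamp, upstream address and OS error code of an nginx
# "connect() to ... failed (<code>: ...)" error-log entry.
LINE_RE = re.compile(
    r'^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[\w+\] .*?'
    r'connect\(\) to (?P<upstream>\S+) failed \((?P<code>\d+):'
)


def summarize_connect_failures(lines):
    """Count connect() failures per (timestamp, upstream, error code)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            key = (match["ts"], match["upstream"], int(match["code"]))
            counts[key] += 1
    return counts


# Shortened sample entries; in practice iterate over the real error log file.
sample = [
    '2020/08/07 09:29:22 [crit] 12236#11116: *457 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed) while connecting to upstream',
    '2020/08/07 09:29:22 [crit] 12236#11116: *458 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed) while connecting to upstream',
    '2020/08/07 09:29:59 [crit] 12236#11116: *490 connect() to 172.18.44.6:8091 failed (10055: An operation on a socket could not be performed) while connecting to upstream',
]

for (ts, upstream, code), n in sorted(summarize_connect_failures(sample).items()):
    print(f"{ts}  {upstream}  error {code}  x{n}")
```

A burst of many failures within the same second, as in the log above, suggests a resource that runs out under load rather than a backend that is simply down.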

Checking the port did not show a large queue of waiting connections either.

netstat -ano |findstr "8091"
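
Reading the netstat output by eye makes it easy to miss how many connections sit in each state. On Windows, error 10055 is often reported when the machine runs out of ephemeral ports, for example because many connections linger in TIME_WAIT; whether that is the cause here is an assumption, not something the original investigation confirmed. The sketch below tallies TCP states for one port from text in the `netstat -ano` format. The function name count_tcp_states and the sample rows are hypothetical.

```python
from collections import Counter


def count_tcp_states(netstat_output, port):
    """Count TCP connection states for rows whose local or remote port is `port`."""
    suffix = f":{port}"
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows of `netstat -ano` look like: TCP <local> <foreign> <state> <pid>
        if len(fields) < 5 or fields[0] != "TCP":
            continue
        local, foreign, state = fields[1], fields[2], fields[3]
        if local.endswith(suffix) or foreign.endswith(suffix):
            states[state] += 1
    return states


# Made-up sample rows in the Windows netstat layout.
sample_output = """
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:8091           0.0.0.0:0              LISTENING       12236
  TCP    172.18.44.5:50123      172.18.44.6:8091       TIME_WAIT       0
  TCP    172.18.44.5:50124      172.18.44.6:8091       TIME_WAIT       0
  TCP    172.18.44.5:50125      172.18.44.6:8091       ESTABLISHED     12236
  TCP    172.18.44.5:50126      172.18.44.6:443        ESTABLISHED     4321
"""

for state, n in count_tcp_states(sample_output, 8091).most_common():
    print(state, n)
```

To run it against a live machine, the text could come from `subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout`. Counting across all ports (not only 8091) may also be worthwhile, since ephemeral ports are shared by every process on the host.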

Still, since the error says the system lacks buffer space, I decided to start from there
and configure proxy buffering in nginx.

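proxy_buffering    on;          # buffer responses from the proxied backend
proxy_buffer_size  4k;          # buffer for the first part of the response (the response headers)
proxy_buffers    8 1M;         # number and size of the buffers for reading a response, per connection
proxy_busy_buffers_size  2M;       # limit on buffers busy sending to the client (here 2 x one proxy_buffers buffer)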
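
These buffers are allocated per proxied connection, so it is worth estimating how much memory they can add up to. The sketch below is a rough upper bound only: it assumes nginx fills every buffer for every request, and that each proxied request occupies two of the 1024 worker_connections (one client side, one upstream side), which gives at most 512 concurrent proxied requests per worker. The helper names parse_size and max_proxy_buffer_bytes are hypothetical.

```python
def parse_size(text):
    """Convert an nginx size such as '4k' or '1M' to a number of bytes."""
    units = {"k": 1024, "m": 1024 ** 2}
    suffix = text[-1].lower()
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    return int(text)


def max_proxy_buffer_bytes(buffer_size, buffers_count, buffers_size):
    """Upper bound on buffer memory for one proxied response."""
    return parse_size(buffer_size) + buffers_count * parse_size(buffers_size)


# Values from the directives above: proxy_buffer_size 4k; proxy_buffers 8 1M;
per_request = max_proxy_buffer_bytes("4k", 8, "1M")
print(per_request)  # 8392704 bytes, about 8 MiB per response

# Assumed worst case: 512 concurrent proxied requests in one worker.
worst_case = 512 * per_request
print(f"{worst_case / 1024 ** 3:.2f} GiB")
```

If the host is already short of memory, buffers this large could add pressure under load, so the sizes are worth revisiting once the real cause is known.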

Solution

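As a temporary workaround, add the proxy buffer settings to nginx.conf. The configuration below also routes requests through an upstream block with keepalive connections to the backend.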

worker_processes  1;

events {
    worker_connections  1024;
}

http {
    include       mime.types;
    default_type  application/octet-stream;

    sendfile        on;
    keepalive_timeout  65;

    upstream http_backend {
        server 172.18.44.6:8091;
        # keep up to 1024 idle connections to the backend per worker process
        keepalive 1024;
    }

    server {
        listen       8091;
        server_name  localhost;

        location /favicon.ico {
            return 200;
            access_log off;
        }

        #charset koi8-r;

        #access_log  logs/host.access.log  main;
        location / {
            proxy_pass http://http_backend;
            # HTTP/1.1 and an empty Connection header are needed for upstream keepalive
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_connect_timeout 300s;
            proxy_send_timeout 300s;
            proxy_read_timeout 300s;
            proxy_buffering on;
            proxy_buffer_size 4k;
            proxy_buffers 8 1M;
            proxy_busy_buffers_size 2M;
            proxy_max_temp_file_size 0;
            proxy_set_header            Host $host:$server_port;
            proxy_set_header            X-real-ip $remote_addr;
            proxy_set_header            X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}

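Result: after the change, 502s still occur, but far less often than before. The next step should be to investigate the server itself, but I lack experience in that area, so for now all I can do is restart it and see what happens.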
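
To put a number on "far less often", the 502 rate could be sampled before and after a change. This is a minimal sketch using only the Python standard library; the helper names http_status, sample_statuses and rate_of are hypothetical, and the demonstration uses a fake fetcher so that it runs without a network.

```python
import urllib.error
import urllib.request
from collections import Counter


def http_status(url, timeout=5.0):
    """Return the HTTP status code of a GET request, or None if there was no response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # urllib raises 4xx/5xx responses as HTTPError
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def sample_statuses(fetch, attempts):
    """Call fetch() `attempts` times and count the returned status codes."""
    return Counter(fetch() for _ in range(attempts))


def rate_of(counts, status):
    """Fraction of sampled requests that returned `status`."""
    total = sum(counts.values())
    return counts[status] / total if total else 0.0


# Demonstration with a fake fetcher instead of real requests.
fake_results = iter([200, 502, 200, 200])
counts = sample_statuses(lambda: next(fake_results), 4)
print(f"502 rate: {rate_of(counts, 502):.0%}")
```

Against the real proxy, the fetcher would be something like `lambda: http_status("http://localhost:8091/")`, where the URL is an assumption based on the listen and server_name directives above.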
