Nginx - Proxying and Caching

Nginx

Tags: nginx


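Proxying

Proxy services fall into two broad kinds, forward proxies and reverse proxies:

  • Forward proxy: proxies connections from an internal network out to the Internet (a role comparable to a VPN or NAT gateway). The client names the proxy server explicitly and sends it the HTTP request it would otherwise have sent straight to the target web server; the proxy then contacts the web server and relays the response back to the client.

  • Reverse proxy: the opposite direction. When a local network offers resources to the Internet and outside users need to reach them, a proxy server can sit in front of those resources; the service it provides is reverse proxying. The reverse proxy accepts connections from the Internet, forwards each request to a server on the internal network, and relays the response back to the client that made the request.

  • In summary:

    • A forward proxy is on the client's side: to the target server, proxy and client together look like a single client;
    • A reverse proxy is on the target server's side: to the client, proxy and server together pose as a single target server.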

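Forward proxy

Nginx is rarely used as a forward proxy, so its support for this role is minimal and needs only a few directives: resolver and resolver_timeout come from ngx_http_core_module, and proxy_pass from ngx_http_proxy_module.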


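Directives

  • resolver
Syntax: resolver address ... [valid=time] [ipv6=on|off];
Default:    —
Context:    http, server, location
- address: IP address of a DNS server; port 53 is used when no port is given. Multiple addresses are supported since version 1.1.7.
- time: how long Nginx caches a DNS answer. By default the TTL of the answer is used; valid=time overrides it.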
  • resolver_timeout
Syntax: resolver_timeout time;
Default:    resolver_timeout 30s;
Context:    http, server, location
Sets the timeout for name resolution.
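  • proxy_pass
Syntax: proxy_pass URL;
Default:    —
Context:    location, if in location, limit_except
Sets the protocol and address of the proxied server. For a forward proxy the setting is essentially fixed: proxy_pass http://$http_host$request_uri;

Note: proxy_pass is not specific to forward proxying; its main use is reverse proxying, where it is described in more detail below.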


Example

    server {
        listen       8001;

        resolver 192.168.111.9 192.168.111.8 192.168.100.8 192.168.100.9;

        location / {
            proxy_pass http://$http_host$request_uri;
        }

        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }
    }

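Notes:
1. Do not use the server_name directive in a forward-proxy server block, i.e. do not bind it to a virtual host name or IP.
2. An Nginx forward proxy configured this way cannot proxy HTTPS sites, because stock Nginx does not handle the HTTP CONNECT method that clients use to tunnel HTTPS through a proxy.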
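
The resolver parameters described above can be combined with the example as in the following sketch. The 30s cache time and 5s timeout are illustrative assumptions, not recommended values; the DNS addresses are the ones used in the example above.

```nginx
server {
    listen 8001;

    # Cache every DNS answer for 30s, overriding the TTL in the answer.
    resolver 192.168.111.9 192.168.111.8 valid=30s;
    # Give up on a lookup after 5s instead of the default 30s.
    resolver_timeout 5s;

    location / {
        # Forward the request to whatever host the client asked for.
        proxy_pass http://$http_host$request_uri;
    }
}
```

A client would then be pointed at port 8001 of this machine as its HTTP proxy.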


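Reverse proxy

Reverse proxying is one of the most used and most important features of Nginx. It is provided by the standard HTTP module ngx_http_proxy_module. As with a forward proxy, a reverse proxy is usually given a server block of its own.


Directives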

  • proxy_pass
Syntax: proxy_pass URL;
Default:    —
Context:    location, if in location, limit_except

As in the forward-proxy case, this directive sets the address of the proxied server, for example a host name or IP address plus a port:

proxy_pass http://localhost:8000/uri/;
  • upstream
Syntax: upstream name { ... }
Default:    —
Context:    http

When the proxied target is a group of servers, the upstream directive defines the group. Servers can listen on different ports. In addition, servers listening on TCP and UNIX-domain sockets can be mixed.

http {
    ## ...

    upstream proxy_servs {
        server 10.45.156.170:80;
        server 10.45.156.171:80;
        server 10.45.156.172:80;
    }

    server {
        location ~* \.(do|jsp|jspx)$ {
            proxy_pass http://proxy_servs;
        }

        ## ...
    }
}

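Note: how Nginx handles the request URI depends on whether the URL given to proxy_pass itself contains a URI part:
1. If the URL has no URI part, Nginx passes the request URI to the backend unchanged;
2. If the URL has a URI part, Nginx replaces the part of the request URI that matched the location with the URI from the directive.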
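
The two cases can be seen side by side in a short sketch. The /app/, /api/ and /v1/ paths are made up for illustration; the backend address is taken from the upstream example above.

```nginx
location /app/ {
    # No URI part after the address: the request URI is passed unchanged.
    # GET /app/list   ->  backend receives /app/list
    proxy_pass http://10.45.156.170:80;
}

location /api/ {
    # URI part "/v1/" present: the "/api/" prefix matched by the location
    # is replaced with "/v1/".
    # GET /api/users  ->  backend receives /v1/users
    proxy_pass http://10.45.156.170:80/v1/;
}
```

Inside a location defined by a regular expression, such as the .do/.jsp location above, Nginx cannot tell which part of the URI to replace, so proxy_pass should be written without a URI part there.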

  • Other reverse-proxy directives
Directive Description
proxy_pass_request_headers on | off; Indicates whether the header fields of the original request are passed to the proxied server.
proxy_pass_request_body on | off; Indicates whether the original request body is passed to the proxied server.
proxy_set_header field value; Allows redefining or appending fields to the request header passed to the proxied server.
proxy_set_body value; Allows redefining the request body passed to the proxied server.
proxy_hide_header field; The proxy_hide_header directive sets additional fields that will not be passed.
proxy_pass_header field; Permits passing otherwise disabled header fields (by default “Date”, “Server”, “X-Pad” and “X-Accel-...”) from a proxied server to a client.
proxy_bind address [transparent] | off; Makes outgoing connections to a proxied server originate from the specified local IP address.
proxy_connect_timeout time; Defines a timeout for establishing a connection with a proxied server.
proxy_read_timeout time; Defines a timeout for reading a response from the proxied server.
proxy_send_timeout time; Sets a timeout for transmitting a request to the proxied server.
proxy_http_version 1.0 | 1.1; Sets the HTTP protocol version for proxying. By default, version 1.0 is used.
proxy_method method; Specifies the HTTP method to use in requests forwarded to the proxied server instead of the method from the client request.
proxy_ignore_client_abort on | off; Determines whether the connection with a proxied server should be closed when a client closes the connection without waiting for a response.
proxy_ignore_headers field ...; Disables processing of certain response header fields from the proxied server.
proxy_redirect default | off | redirect replacement; Sets the text that should be changed in the “Location” and “Refresh” header fields of a proxied server response.
proxy_intercept_errors on | off; Determines whether proxied responses with codes greater than or equal to 300 should be passed to a client or be redirected to nginx for processing with the error_page directive.
proxy_headers_hash_max_size size; Sets the maximum size of hash tables used by the proxy_hide_header and proxy_set_header directives.
proxy_headers_hash_bucket_size size; Sets the bucket size for hash tables used by the proxy_hide_header and proxy_set_header directives.
proxy_next_upstream [flag]; Specifies in which cases a request should be passed to the next server.
proxy_ssl_session_reuse on | off; Determines whether SSL sessions can be reused when working with the proxied server.
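
Several of these directives commonly appear together in a reverse-proxy location. The following is a minimal sketch, not a recommended production setup: the timeout values and the X-Real-IP and X-Powered-By header names are assumptions chosen for illustration, and proxy_servs is the upstream group defined earlier.

```nginx
location / {
    proxy_pass http://proxy_servs;

    # Pass the original Host header and the client address to the backend.
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

    # Do not pass this backend response header on to clients.
    proxy_hide_header X-Powered-By;

    # Connection, read and send timeouts towards the backend.
    proxy_connect_timeout 5s;
    proxy_read_timeout 30s;
    proxy_send_timeout 30s;

    # Try the next server in the group on connection errors and timeouts.
    proxy_next_upstream error timeout;
}
```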

Proxy-Buffer

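With Proxy Buffer enabled, Nginx delivers the proxied server's response to the client asynchronously:

Nginx first reads as much of the response as it can from the backend into the buffers. If the buffers fill up while it is still reading, Nginx writes part of the received data to a temporary file on disk. Once the response has been received completely, or the buffers are full, Nginx starts sending data to the client; the buffers being sent are then in the BUSY state.

With Proxy Buffer disabled, Nginx passes response data to the client synchronously, as soon as it arrives, without waiting to read the whole response.

Directive Description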
proxy_buffering on | off; Enables or disables buffering of responses from the proxied server.
proxy_buffers number size; Sets the number and size of the buffers used for reading a response from the proxied server, for a single connection.
proxy_buffer_size size; Sets the size of the buffer used for reading the first part of the response received from the proxied server.
proxy_busy_buffers_size size; When buffering of responses from the proxied server is enabled, limits the total size of buffers that can be busy sending a response to the client while the response is not yet fully read.
proxy_temp_path path [level1 [level2 [level3]]]; Defines a directory for storing temporary files with data received from proxied servers.
proxy_temp_file_write_size size; Limits the size of data written to a temporary file at a time, when buffering of responses from the proxied server to temporary files is enabled.
proxy_max_temp_file_size size; This directive sets the maximum size of the temporary file.

Note: Proxy Buffer settings apply per request, not globally: each request gets its own buffers sized by these directives, and Nginx does not create one shared Proxy Buffer for all proxied requests.
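
A sketch of how the buffer directives fit together is shown below. All sizes are illustrative assumptions; /var/www/tmp is the temporary path used elsewhere in this article.

```nginx
location / {
    proxy_pass http://proxy_servs;

    proxy_buffering on;
    # Buffer for the first part of the response (normally the headers).
    proxy_buffer_size 8k;
    # Up to 8 buffers of 16k each per connection, 128k in total.
    proxy_buffers 8 16k;
    # At most 32k of those buffers may be busy sending to the client
    # while the rest of the response is still being read.
    proxy_busy_buffers_size 32k;

    # When the memory buffers are full, spill to temporary files here.
    proxy_temp_path /var/www/tmp;
    # Upper limit for one temporary file.
    proxy_max_temp_file_size 256m;
    # Amount written to a temporary file at a time.
    proxy_temp_file_write_size 32k;
}
```

Because the buffers are allocated per request, the memory cost of these settings scales with the number of concurrent proxied requests.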


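Load balancing

An important use of the Nginx reverse proxy is load balancing:

Load balancing uses a distribution policy to spread network load evenly across the nodes of a cluster, so that a single heavy task can be split across several units and processed in parallel, or a large volume of concurrent traffic can be spread over several nodes and handled separately, which shortens the time users wait for a response.

In practice, load balancing is classified by network layer (usually following the ISO/OSI seven-layer reference model). Modern load balancers mostly operate at layer 4 or layer 7, independently of the underlying network hardware; Nginx is generally regarded as a layer-7 load balancer.
Load-balancing algorithms are either static or dynamic. Static algorithms are simpler: plain round-robin, ratio-based weighted round-robin, and priority-based weighted round-robin. Dynamic algorithms adapt better to complex network conditions: least connections (based on workload), fastest response (based on performance), predictive algorithms, and dynamic performance allocation. By default Nginx uses weighted round-robin.


Load balancing in Nginx

The upstream example above used plain round-robin across all requests. The following configuration uses weighted round-robin instead: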

http {
    ## ...

    upstream proxy_servs {
        server 10.45.156.170:80 weight=5;
        server 10.45.156.171:80 weight=2;
        server 10.45.156.172:80;    # weight defaults to 1
    }

    server {
        location ~* \.(do|jsp|jspx)$ {
            proxy_pass http://proxy_servs;
        }

        ## ...
    }
}

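Each server in the upstream group is given its own weight, which is the weighting factor of the round-robin policy. 10.45.156.170:80 has the highest weight: with weights 5, 2 and 1, about 5 of every 8 requests go to it.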
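
For comparison, the least-connections method mentioned among the dynamic algorithms can be selected in the same upstream block. This is a sketch only: making the third server a backup and the max_fails/fail_timeout values are assumptions added for illustration.

```nginx
upstream proxy_servs {
    # Pick the server with the fewest active connections,
    # taking the weights into account.
    least_conn;

    server 10.45.156.170:80 weight=5;
    # Mark this server unavailable for 30s after 3 failed attempts.
    server 10.45.156.171:80 weight=2 max_fails=3 fail_timeout=30s;
    # Used only when the other servers are unavailable.
    server 10.45.156.172:80 backup;
}
```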


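Caching

Response time is one of the key measures of a web application's performance. For a dynamic site, besides optimizing the published content itself, another important technique is to turn the output of dynamic pages that do not need real-time updates into cached static pages and serve them as static files, which speeds up responses.


Cache-driving techniques

Nginx caching can be driven in two ways:

404-driven

  • How it works:

    While handling a client request, Nginx raises a 404 error as soon as it finds that the requested resource does not exist. Nginx catches that error, forwards the request to the backend server, returns the backend's response to the client, and stores a copy locally.

  • Configuration: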

location / {
    root html;
    error_page 404 =200 @send_to_backend;
}

location @send_to_backend {
    internal;
    proxy_pass http://proxy_servs;

    proxy_set_header Accept-Encoding "";
    proxy_store on;
    proxy_store_access user:rw group:rw all:r;
    proxy_temp_path /var/www/tmp;
}

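The proxy_store directive is the simple caching mechanism of Nginx Proxy Store, described further below.


Missing-resource-driven

  • How it works:

    Much the same as the 404-driven approach, except that an if condition inside the location block checks directly whether the requested resource exists; if it does not, Nginx contacts the backend server and updates the web cache.

  • Configuration: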

location / {
    root html;

    proxy_set_header Accept-Encoding "";
    proxy_store on;
    proxy_store_access user:rw group:rw all:r;
    proxy_temp_path /var/www/tmp;

    if ( !-f $request_filename ){
        proxy_pass http://proxy_servs;
    }
}

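!-f tests whether the requested file exists. If it does not, the request is passed with proxy_pass to the backend, which generates the data returned to the client, while Proxy Store caches a copy.


Caching in Nginx

Nginx itself implements two caching mechanisms, Proxy Cache and Proxy Store:

Proxy Cache

Proxy Cache is Nginx's own full-featured, well-performing caching mechanism. After Nginx starts, a dedicated process (the cache loader) scans the cache files on disk and builds an in-memory index of them to speed up lookups, and a dedicated management process (the cache manager) handles expiry checks and cleanup of the cache files on disk. Proxy Cache can cache responses with any status code, not only 200.

It differs from the Proxy Buffer described earlier: Proxy Buffer makes delivery of backend responses asynchronous, while Proxy Cache lets Nginx answer client requests quickly. When Nginx receives a response from the backend, it passes the data to the client through the Proxy Buffer mechanism and also stores it locally according to the Proxy Cache configuration. The next time a client requests the same data, Nginx serves it straight from the local cache, saving the round trip to the backend.

Directive Description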
proxy_cache zone | off; Defines a shared memory zone used for caching.
proxy_cache_bypass string ...; Defines conditions under which the response will not be taken from a cache.
proxy_cache_key string; Defines a key for caching.
proxy_cache_lock on | off; When enabled, only one request at a time will be allowed to populate a new cache element identified according to the proxy_cache_key directive by passing a request to a proxied server.
proxy_cache_lock_timeout time; Sets a timeout for proxy_cache_lock.
proxy_cache_min_uses number; Sets the number of requests after which the response will be cached.
proxy_cache_use_stale [stale]; Determines in which cases a stale cached response can be used when an error occurs during communication with the proxied server. The directive’s parameters match the parameters of the proxy_next_upstream directive.
proxy_cache_valid [code ...] time; Sets caching time for different response codes.
proxy_cache_path path keys_zone=name:size; Sets the path and other parameters of a cache.
proxy_no_cache string ...; Defines conditions under which the response will not be saved to a cache.
http {
    ## ...

    upstream proxy_servs {
        server 10.45.156.170:80;
        server 10.45.156.171:80;
        server 10.45.156.172:80;
    }

    proxy_cache_path /var/www/proxycache levels=1:2 max_size=2m inactive=5m loader_sleep=1m keys_zone=MYPROXYCACHE:10m;
    proxy_temp_path /var/www/tmp;


    server {
        ## ...

        location / {
            proxy_pass http://proxy_servs;
            proxy_cache MYPROXYCACHE;
            proxy_cache_valid 200 302 1h;
            proxy_cache_valid 301 1d;
            proxy_cache_valid any 1m;
        }

        location @send_to_backend {
            proxy_pass http://proxy_servs;
        }
    }
}

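Note: Proxy Cache depends on Proxy Buffer: with proxy_buffering off, responses are not cached. The cache manager removes entries only according to the inactive and max_size limits set in proxy_cache_path, and open-source Nginx has no built-in way to purge specific cached files on demand, so a missing or oversized max_size can put pressure on server storage over long periods of use.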
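
The remaining directives from the table can be added to the cached location as in the sketch below. It reuses the MYPROXYCACHE zone and the proxy_servs group from the example above; the nocache cookie/argument and the X-Cache-Status header name are assumptions chosen for illustration.

```nginx
location / {
    proxy_pass http://proxy_servs;
    proxy_cache MYPROXYCACHE;

    # Key that identifies a cached response (this is the default value).
    proxy_cache_key $scheme$proxy_host$request_uri;
    proxy_cache_valid 200 302 1h;

    # Cache a response only once it has been requested twice.
    proxy_cache_min_uses 2;
    # Let only one request at a time populate a new cache entry.
    proxy_cache_lock on;
    # Serve a stale entry if the backend errors out or times out.
    proxy_cache_use_stale error timeout;

    # Requests carrying a "nocache" cookie or argument are not answered
    # from the cache, and their responses are not stored.
    proxy_cache_bypass $cookie_nocache $arg_nocache;
    proxy_no_cache $cookie_nocache $arg_nocache;

    # Report HIT, MISS, BYPASS, EXPIRED and so on, for debugging.
    add_header X-Cache-Status $upstream_cache_status;
}
```

Checking the X-Cache-Status response header on repeated requests is a simple way to confirm that the cache is actually being hit.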


Proxy Store

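Nginx also supports a second way of storing backend data locally, Proxy Store. Unlike Proxy Cache, it does only simple caching of backend responses, static files in particular: it stores only responses with status 200 and has no support for expiry, refresh, or an in-memory index, but it does let you set user/group access permissions on the stored files.

Directive Description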
proxy_store on | off | string; Enables saving of files to a disk.
proxy_store_access users:permissions ...; Sets access permissions for newly created files and directories.

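Memcached caching

Memcached is a high-performance distributed caching system. Used with dynamic web applications, it reduces the load on backend data servers and speeds up responses to clients. The standard Nginx module ngx_http_memcached_module provides Memcached support.

Directive Description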
memcached_pass address; Sets the memcached server address.
memcached_connect_timeout time; Defines a timeout for establishing a connection with a memcached server.
memcached_read_timeout time; Defines a timeout for reading a response from the memcached server.
memcached_send_timeout time; Sets a timeout for transmitting a request to the memcached server.
memcached_buffer_size size; Sets the size of the buffer used for reading the response received from the memcached server.
memcached_next_upstream status ...; Specifies in which cases a request should be passed to the next server.

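When configuring Nginx to use Memcached, the $memcached_key variable must also be set.


Example

Nginx queries Memcached first. On a cache miss (the key is "$uri?$args"), Nginx passes the request with proxy_pass to the backend server. The backend has to cooperate: after responding to the client, it must itself write the response body into Memcached so that the next request can be served directly from Memcached.

  • nginx.conf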
location / {
    set $memcached_key "$uri?$args";
    memcached_pass 127.0.0.1:11211;
    error_page 404 =200 @send_to_backend;
    index  index.html index.htm;
}

location @send_to_backend {
    proxy_pass http://proxy_servs;
}
  • Java: MemcachedFilter
import java.io.IOException;
import java.io.PrintWriter;
import java.util.concurrent.TimeoutException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpServletResponseWrapper;

import net.rubyeye.xmemcached.MemcachedClient;
import net.rubyeye.xmemcached.MemcachedClientBuilder;
import net.rubyeye.xmemcached.XMemcachedClientBuilder;
import net.rubyeye.xmemcached.exception.MemcachedException;
import net.rubyeye.xmemcached.utils.AddrUtil;

/**
 * @author jifang.
 * @since 2016/5/21 15:50.
 */
public class MemcachedFilter implements Filter {

    private MemcachedClient memcached;

    private static final int _1MIN = 60;

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        try {
            MemcachedClientBuilder builder = new XMemcachedClientBuilder(AddrUtil.getAddresses("127.0.0.1:11211"));
            memcached = builder.build();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public void doFilter(ServletRequest req, ServletResponse response, FilterChain chain) throws IOException, ServletException {

        // Wrap the PrintWriter so the response body can be captured
        MemcachedWriter mWriter = new MemcachedWriter(response.getWriter());
        chain.doFilter(req, new MemcachedResponse((HttpServletResponse) response, mWriter));

        HttpServletRequest request = (HttpServletRequest) req;

        // Mirror the Nginx key "$uri?$args": the "?" is always present, even
        // with no arguments, and the raw query string keeps the argument order.
        String query = request.getQueryString();
        String key = request.getRequestURI() + "?" + (query == null ? "" : query);

        try {
            String rspContent = mWriter.getRspContent();
            memcached.set(key, _1MIN, rspContent);
        } catch (TimeoutException | InterruptedException | MemcachedException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public void destroy() {
    }


    private static class MemcachedWriter extends PrintWriter {

        private StringBuilder sb = new StringBuilder();

        private PrintWriter writer;

        public MemcachedWriter(PrintWriter out) {
            super(out);
            this.writer = out;
        }

        @Override
        public void print(String s) {
            sb.append(s);
            this.writer.print(s);
        }

        public String getRspContent() {
            return sb.toString();
        }
    }

    private static class MemcachedResponse extends HttpServletResponseWrapper {

        private PrintWriter writer;

        public MemcachedResponse(HttpServletResponse response, PrintWriter writer) {
            super(response);
            this.writer = writer;
        }

        @Override
        public PrintWriter getWriter() throws IOException {
            return this.writer;
        }
    }
}

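Distributed Memcached

To make full use of Memcached's distributed nature and speed up responses, we use an Nginx consistent-hash module to spread requests across several Memcached servers. Cache misses again need help from the backend: while responding to the client, the backend must write the response data into Memcached following the same consistent-hash rule.

  • Install the Nginx consistent-hash module

    • git clone https://github.com/replay/ngx_http_consistent_hash.git
    • ./configure --add-module=/root/src/ngx_http_consistent_hash/
    • make && make install

    Other consistent-hash implementations for Nginx are listed at https://www.nginx.com/resources/wiki/modules/

Configure the Memcached consistent-hash rule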

    upstream memcached_servs {
        consistent_hash "$uri?$args";
        server 127.0.0.1:11211;
        server 127.0.0.1:11212;
        server 127.0.0.1:11213;
    }

    server {
        location / {
            set $memcached_key "$uri?$args";
            memcached_pass memcached_servs;
            error_page 404 =200 @send_to_backend;
            index  index.html index.htm;
        }
    }
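
As an alternative sketch, Nginx 1.7.2 and later ship a hash directive in the standard upstream module that supports ketama consistent hashing, so the same grouping can be written without a third-party module. Whether a given Java-side locator maps keys to the same servers as this directive is not verified here and would need to be checked.

```nginx
upstream memcached_servs {
    # Built-in consistent hashing on the same string used for $memcached_key.
    hash "$uri?$args" consistent;
    server 127.0.0.1:11211;
    server 127.0.0.1:11212;
    server 127.0.0.1:11213;
}
```

The server block stays the same as above, since memcached_pass only refers to the upstream group by name.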


  • Java: MemcachedFilter

 Note: the consistent-hash algorithm used on the Java side must match the one used by Nginx; otherwise Nginx may read from a different Memcached server than the one Java wrote to.
public class MemcachedFilter implements Filter {

    private MemcachedClient memcached;

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        try {
            MemcachedClientBuilder builder = new XMemcachedClientBuilder(AddrUtil.getAddresses("127.0.0.1:11211 127.0.0.1:11212 127.0.0.1:11213"));
            builder.setSessionLocator(new ElectionMemcachedSessionLocator());
            memcached = builder.build();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // ...
}