NGINX is well suited to serving static resources such as txt, gif, and png files. Because it is efficient and performant, it is often used for CDN and dynamic/static separation scenarios. Much of that efficiency comes from sendfile(); in addition, the tcp_nopush directive can further improve network efficiency.
By default, NGINX handles file transmission itself and copies the file into the buffer before sending it. Enabling the sendfile directive eliminates the step of copying the data into the buffer and enables direct copying of data from one file descriptor to another. Alternatively, to prevent one fast connection from entirely occupying the worker process, you can use the sendfile_max_chunk directive to limit the amount of data transferred in a single sendfile() call (in this example, to 1 MB):
location /mp3 {
sendfile on;
sendfile_max_chunk 1m;
#...
}
Use the tcp_nopush directive together with the sendfile on; directive. This enables NGINX to send HTTP response headers in one packet right after the chunk of data has been obtained by sendfile().
location /mp3 {
sendfile on;
tcp_nopush on;
#...
}
The tcp_nodelay directive allows override of Nagle’s algorithm, originally designed to solve problems with small packets in slow networks. The algorithm consolidates a number of small packets into a larger one and sends the packet with a 200 ms delay. Nowadays, when serving large static files, the data can be sent immediately regardless of the packet size. The delay also affects online applications (ssh, online games, online trading, and so on). By default, the tcp_nodelay directive is set to on, which means that Nagle’s algorithm is disabled. Use this directive only for keepalive connections:
location /mp3 {
tcp_nodelay on;
keepalive_timeout 65;
#...
}
One of the important factors is how fast NGINX can handle incoming connections. The general rule is when a connection is established, it is put into the “listen” queue of a listen socket. Under normal load, either the queue is small or there is no queue at all. But under high load, the queue can grow dramatically, resulting in uneven performance, dropped connections, and increased latency.
To display the current listen queue, run this command (this form of netstat is available on FreeBSD; on Linux, ss -ltn reports similar per-socket queue information):
netstat -Lan
The output might be like the following, which shows that in the listen queue on port 80 there are 10 unaccepted connections against the configured maximum of 128 queued connections. This situation is normal.
Current listen queue sizes (qlen/incqlen/maxqlen)
Listen Local Address
0/0/128 *.12345
10/0/128 *.80
0/0/128 *.8080
In contrast, in the following output the number of unaccepted connections (192) exceeds the limit of 128. This is quite common when a web site experiences heavy traffic. To achieve optimal performance, you need to increase the maximum number of connections that can be queued for acceptance by NGINX in both your operating system and the NGINX configuration.
Current listen queue sizes (qlen/incqlen/maxqlen)
Listen Local Address
0/0/128 *.12345
192/0/128 *.80
0/0/128 *.8080
Increase the value of the net.core.somaxconn kernel parameter from its default value (128) to a value high enough for a large burst of traffic. In this example, it’s increased to 4096.
For Linux:
Run the command:
sudo sysctl -w net.core.somaxconn=4096
Use a text editor to add the following line to /etc/sysctl.conf:
net.core.somaxconn = 4096
If you set the somaxconn kernel parameter to a value greater than 512, change the backlog parameter of the NGINX listen directive to match:
server {
listen 80 backlog=4096;
# ...
}
This section describes how to configure compression or decompression of responses, as well as sending compressed files.
Compressing responses often significantly reduces the size of transmitted data. However, since compression happens at runtime it can also add considerable processing overhead which can negatively affect performance. NGINX performs compression before sending responses to clients, but does not “double compress” responses that are already compressed (for example, by a proxied server).
To enable compression, include the gzip directive with the on parameter.
gzip on;
By default, NGINX compresses responses only with MIME type text/html. To compress responses with other MIME types, include the gzip_types directive and list the additional types.
gzip_types text/plain application/xml;
To specify the minimum length of the response to compress, use the gzip_min_length directive. The default is 20 bytes (here adjusted to 1000):
gzip_min_length 1000;
By default, NGINX does not compress responses to proxied requests (requests that come from the proxy server). The fact that a request comes from a proxy server is determined by the presence of the Via header field in the request. To configure compression of these responses, use the gzip_proxied directive. The directive has a number of parameters specifying which kinds of proxied requests NGINX should compress. For example, it is reasonable to compress responses only to requests that will not be cached on the proxy server. For this purpose the gzip_proxied directive has parameters that instruct NGINX to check the Cache-Control header field in a response and compress the response if the value is no-cache, no-store, or private. In addition, you must include the expired parameter to check the value of the Expires header field. These parameters are set in the following example, along with the auth parameter, which checks for the presence of the Authorization header field (an authorized response is specific to the end user and is not typically cached):
gzip_proxied no-cache no-store private expired auth;
As with most other directives, the directives that configure compression can be included in the http context or in a server or location configuration block.
The overall configuration of gzip compression might look like this.
server {
gzip on;
gzip_types text/plain application/xml;
gzip_proxied no-cache no-store private expired auth;
gzip_min_length 1000;
...
}
Some clients do not support responses with the gzip encoding method. At the same time, it might be desirable to store compressed data, or compress responses on the fly and store them in the cache. To successfully serve both clients that do and do not accept compressed data, NGINX can decompress data on the fly when sending it to the latter type of client.
To enable runtime decompression, use the gunzip directive.
location /storage/ {
gunzip on;
...
}
The gunzip directive can be specified in the same context as the gzip directive:
server {
gzip on;
gzip_min_length 1000;
gunzip on;
...
}
Note that this directive is defined in a separate module that might not be included in an open source NGINX build by default.
To send a compressed version of a file to the client instead of the regular one, set the gzip_static directive to on within the appropriate context.
location / {
gzip_static on;
}
In this case, to service a request for /path/to/file, NGINX tries to find and send the file /path/to/file.gz. If the file doesn’t exist, or the client does not support gzip, NGINX sends the uncompressed version of the file.
Note that the gzip_static directive does not enable on-the-fly compression. It merely uses a file compressed beforehand by any compression tool. To compress content (and not only static content) at runtime, use the gzip directive.
This directive is defined in a separate module that might not be included in an open source NGINX build by default.
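For example, the compressed copies can be produced ahead of time with the gzip command line tool (a minimal sketch; the path is illustrative, and the -k option, which keeps the original file, requires a reasonably recent gzip):
# create index.html.gz next to the original so gzip_static can serve it
gzip -k -9 /usr/share/nginx/html/index.html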
First, create the relevant resource files and directories. The directory structure looks like this:
[root@diwuqingrou html]# pwd
/usr/share/nginx/html
[root@diwuqingrou html]# ls
50x.html doc download images index.html
[root@diwuqingrou html]#
Modify the configuration file default.conf:
server {
    listen 80;
    server_name 193.112.69.164;
    # enable sendfile
    sendfile on;

    # image compression
    location ~ .*\.(jpg|gif|png)$ {
        gzip on;
        # minimum HTTP protocol version required for compression
        gzip_http_version 1.1;
        # compression level
        gzip_comp_level 2;
        # MIME types to compress
        gzip_types text/plain application/javascript application/x-javascript text/css application/xml text/javascript application/x-httpd-php image/jpeg image/gif image/png;
        root /usr/share/nginx/html/images;
    }

    # text compression
    location ~ .*\.(txt|xml)$ {
        gzip on;
        gzip_http_version 1.1;
        gzip_comp_level 1;
        gzip_types text/plain application/javascript application/x-javascript text/css application/xml text/javascript application/x-httpd-php image/jpeg image/gif image/png;
        root /usr/share/nginx/html/doc;
    }

    # serving pre-compressed files, e.g. xx.gz downloads
    location ~ ^/download {
        # serve existing .gz files directly
        gzip_static on;
        # send headers and data efficiently together with sendfile
        tcp_nopush on;
        root /usr/share/nginx/html;
    }

    error_page 500 502 503 504 404 /50x.html;
    location = /50x.html {
        root /usr/share/nginx/html;
    }
}
location ~ .*\.(jpg|gif|png)$ handles image compression. First upload an image to the images directory, then request it at http://127.0.0.1/web.png and note the transferred file size.
Then disable gzip by changing gzip on to gzip off in each location, request the image again, and compare the transferred size.
Comparing 244 KB with 277 KB, the image is only compressed by a small amount; PNG data is already compressed, so the gain is limited.
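To verify compression from the command line (a quick sketch using the address and file from the example above), compare the response headers with and without gzip negotiation; the compressed response should carry a Content-Encoding: gzip header:
# request with gzip negotiation
curl -s -D - -o /dev/null -H "Accept-Encoding: gzip" http://127.0.0.1/web.png
# request without gzip negotiation
curl -s -D - -o /dev/null http://127.0.0.1/web.png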
location ~ .*\.(txt|xml)$ handles text compression. Upload a txt or xml file to the doc directory (a txt file is used here) and run the same test.
With gzip disabled, the file is transferred at its full size; after enabling gzip, the transferred size is noticeably smaller, since plain text compresses very well.
location ~ ^/download serves files that were compressed ahead of time: when a request arrives for xxx and an xxx.gz file exists, gzip_static serves the compressed file, so the client only needs to request xxx to download it. To test this, compress a file into .gz format and place it in the download directory. Here the file is User.java.gz, so requesting http://127.0.0.1/download/User.java downloads it.
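A minimal sketch of that test from the command line (file names follow the example above; --compressed makes curl advertise gzip support and decode the response):
# create the pre-compressed copy that gzip_static will serve
gzip -k User.java
# request the uncompressed name; NGINX serves User.java.gz with Content-Encoding: gzip
curl --compressed -o User.java.downloaded http://127.0.0.1/download/User.java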
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<title>test sub modules</title>
</head>
<body>
<a href="https://kubernetes.io/">kubernetes</a>
<a href="https://www.docker.com/">docker</a>
<a href="https://spring.io/">spring</a>
</body>
</html>
server {
listen 80;
server_name 193.112.69.164;
location ~ .*\.(htm|html)$ {
expires 24h; # set how long the browser may cache the response
root /usr/share/nginx/html;
}
error_page 500 502 503 504 404 /50x.html;
location = /50x.html {
root /usr/share/nginx/html;
}
}
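To confirm the caching headers (assuming the server address from the configuration above), a request such as the following should show Expires and Cache-Control: max-age headers in the response:
curl -I http://193.112.69.164/index.html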
Browsers block cross-origin access by default, largely because unrestricted cross-origin requests make attacks such as CSRF easier. When cross-origin access is needed, the server has to send the appropriate response headers.
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<title>website</title>
<script src="http://libs.baidu.com/jquery/2.1.4/jquery.min.js"></script>
<script type="text/javascript">
$(document).ready(function(){
$.ajax({
type: "GET",
url: "http://193.112.69.164/user.html",
success: function(data) {
alert("sucess!!!");
},
error: function() {
alert("fail!!!,请刷新再试!");
}
});
});
</script>
</head>
<body>
<a href="https://kubernetes.io/">kubernetes</a>
<a href="https://www.docker.com/">docker</a>
<a href="https://spring.io/">spring</a>
</body>
</html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<title>user</title>
</head>
<body>
<p>user</p>
</body>
</html>
Modify the configuration file default.conf:
server {
listen 80;
server_name 193.112.69.164;
location ~ .*\.(htm|html)$ {
add_header Access-Control-Allow-Origin *;
add_header Access-Control-Allow-Methods GET,POST,PUT,DELETE,OPTIONS;
root /usr/share/nginx/html;
}
error_page 500 502 503 504 404 /50x.html;
location = /50x.html {
root /usr/share/nginx/html;
}
}
Now opening website.html loads user.html through the AJAX request. Inspecting the response headers of user.html shows the Access-Control-* headers added by NGINX.
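The same check can be made from the command line (the Origin value below is only illustrative); the response should include the Access-Control-Allow-Origin and Access-Control-Allow-Methods headers:
curl -I -H "Origin: http://another-site.example" http://193.112.69.164/user.html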
server {
listen 80;
server_name 193.112.69.164;
location ~ .*\.(htm|html)$ {
root /usr/share/nginx/html;
}
location ~ .*\.(jpg|gif|png)$ {
# check whether the request's Referer is allowed; if not, return 403
valid_referers none blocked 193.112.69.164 ~web\.jpg;
if ($invalid_referer) {
return 403;
}
root /usr/share/nginx/html/images;
}
error_page 500 502 503 504 404 /50x.html;
location = /50x.html {
root /usr/share/nginx/html;
}
}
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<title>Document</title>
</head>
<body>
<img src="http://193.112.69.164/web.jpg"/>
</body>
</html>
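A quick way to exercise the valid_referers rule from the command line (evil.example.com below stands in for a disallowed site): the first request should return 200, the second 403:
# allowed: Referer matches the configured server address
curl -I -e "http://193.112.69.164/" http://193.112.69.164/web.jpg
# blocked: Referer from another site
curl -I -e "http://evil.example.com/" http://193.112.69.164/web.jpg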
Forward proxying is not covered here; NGINX is mainly used as a reverse proxy.
This part describes the basic configuration of proxying. You will learn how to pass requests from NGINX to proxied servers over different protocols, modify the client request headers that are sent to the proxied server, and configure buffering of responses coming from the proxied servers.
Proxying is typically used to distribute the load among several servers, seamlessly show content from different websites, or pass requests for processing to application servers over protocols other than HTTP.
When NGINX proxies a request, it sends the request to a specified proxied server, fetches the response, and sends it back to the client. Requests can be proxied to an HTTP server (another NGINX server or any other server) or to a non-HTTP server running an application developed with a specific framework, such as PHP or Python, using the corresponding protocol. Supported protocols include FastCGI, uwsgi, SCGI, and memcached.
To pass a request to an HTTP proxied server, specify the proxy_pass directive inside a location. For example:
location /some/path/ {
proxy_pass http://www.example.com/link/;
}
This example configuration results in passing all requests processed in this location to the proxied server at the specified address. The address can be specified as a domain name or an IP address, and it may also include a port:
location ~ \.php {
proxy_pass http://127.0.0.1:8000;
}
Note that in the first example above, the address of the proxied server is followed by a URI, /link/. If a URI is specified along with the address, it replaces the part of the request URI that matches the location parameter. For example, a request with the URI /some/path/page.html is proxied to http://www.example.com/link/page.html. If the address is specified without a URI, or it is not possible to determine which part of the URI should be replaced, the full request URI is passed (possibly modified).
To pass a request to a non-HTTP proxied server, the appropriate **_pass directive should be used:
fastcgi_pass passes a request to a FastCGI server
uwsgi_pass passes a request to a uwsgi server
scgi_pass passes a request to an SCGI server
memcached_pass passes a request to a memcached server
Note that in these cases, the rules for specifying addresses may be different. You may also need to pass additional parameters to the server (see the reference documentation for more detail).
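For instance, a PHP-FPM backend is typically reached with fastcgi_pass rather than proxy_pass. A minimal sketch (the socket path is an assumption that depends on the local PHP-FPM installation):
location ~ \.php$ {
    include fastcgi_params;                   # standard FastCGI parameter set shipped with NGINX
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/run/php-fpm/www.sock;  # assumed PHP-FPM socket path
}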
The proxy_pass directive can also point to a named group of servers. In that case, requests are distributed among the servers in the group according to the specified load-balancing method.
By default, NGINX redefines two header fields in proxied requests, "Host" and "Connection", and eliminates the header fields whose values are empty strings. "Host" is set to the $proxy_host variable, and "Connection" is set to close.
To change these settings, as well as modify other header fields, use the proxy_set_header directive. The directive can be specified in one or more location blocks. It can also be specified in a particular server context or in the http block. For example:
location /some/path/ {
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_pass http://localhost:8000;
}
In this configuration, the "Host" field is set to the $host variable.
To prevent a header field from being passed to the proxied server, set it to an empty string as follows:
location /some/path/ {
proxy_set_header Accept-Encoding "";
proxy_pass http://localhost:8000;
}
By default, NGINX buffers responses from proxied servers. A response is stored in internal buffers and is not sent to the client until the whole response is received. Buffering helps to optimize performance with slow clients, which could waste proxied server time if responses were passed from NGINX to the client synchronously. With buffering enabled, NGINX lets the proxied server finish processing the response quickly, while NGINX stores the response for as long as the client needs to download it.
The directive that enables and disables buffering is proxy_buffering. By default it is set to on and buffering is enabled.
The proxy_buffers directive controls the size and the number of buffers allocated for a request. The first part of the response from a proxied server is stored in a separate buffer whose size is set with the proxy_buffer_size directive. This part usually contains a comparatively small response header and can be made smaller than the buffers for the rest of the response.
In the following example, the default number of buffers is increased and the buffer size for the first portion of the response is made smaller than the default.
location /some/path/ {
proxy_buffers 16 4k;
proxy_buffer_size 2k;
proxy_pass http://localhost:8000;
}
If buffering is disabled, the response is sent to the client synchronously while it is being received from the proxied server. This behavior may be desirable for fast interactive clients that need to start receiving the response as soon as possible.
To disable buffering in a specific location, place the proxy_buffering directive in that location with the off parameter, as follows:
location /some/path/ {
proxy_buffering off;
proxy_pass http://localhost:8000;
}
In this case, NGINX uses only the buffer configured by proxy_buffer_size to store the current part of a response.
A common use of reverse proxying is to provide load balancing.
If your proxy server has several network interfaces, you can choose a particular source IP address for connecting to a proxied server or an upstream. This is useful if a proxied server behind NGINX is configured to accept connections only from particular IP networks or IP address ranges.
Specify the proxy_bind directive and the IP address of the required network interface:
location /app1/ {
proxy_bind 127.0.0.1;
proxy_pass http://example.com/app1/;
}
location /app2/ {
proxy_bind 127.0.0.2;
proxy_pass http://example.com/app2/;
}
The IP address can also be specified with a variable. For example, the $server_addr variable passes the IP address of the network interface that accepted the request:
location /app3/ {
proxy_bind $server_addr;
proxy_pass http://example.com/app3/;
}
The parameters of the proxy module are documented in the official ngx_http_proxy_module reference.
First, modify the configuration file default.conf:
server {
    listen 80;
    server_name localhost;

    location / {
        # proxy requests for / to port 8080
        proxy_pass http://127.0.0.1:8080;
        # rewrite the Location header of upstream redirects
        proxy_redirect default;
        # set the request headers added when proxying to http://127.0.0.1:8080
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        # connection timeout settings
        proxy_connect_timeout 30;  # TCP connect timeout
        proxy_send_timeout 60;
        proxy_read_timeout 60;
        # buffering settings to reduce frequent I/O; data is kept in memory first
        # and spills to temporary files on disk when memory is not enough
        proxy_buffer_size 32k;     # buffer for the first part of the response
        proxy_buffering on;        # enable buffering
        proxy_buffers 4 128k;
        proxy_busy_buffers_size 256k;
        proxy_max_temp_file_size 256k;
    }

    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
        root /usr/share/nginx/html;
    }
}
Now, requests to the NGINX home page are proxied to http://127.0.0.1:8080, where a Tomcat instance is running.
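A quick check (assuming Tomcat is listening on 8080 as above) is to compare the response served directly by Tomcat with the one served through NGINX on port 80; the bodies should match:
curl -s http://127.0.0.1:8080/ | head
curl -s http://127.0.0.1/ | head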
Load balancing across multiple application instances is a commonly used technique for optimizing resource utilization, maximizing throughput, reducing latency, and ensuring fault‑tolerant configurations.
Watch the NGINX Plus for Load Balancing and Scaling webinar on demand for a deep dive on techniques that NGINX users employ to build large‑scale, highly available web services.
NGINX and NGINX Plus can be used in different deployment scenarios as a very efficient HTTP load balancer.
To start using NGINX Plus or NGINX open source to load balance HTTP traffic to a group of servers, first you need to define the group with the upstream directive. The directive is placed in the http context.
Servers in the group are configured using the server directive (not to be confused with the server block that defines a virtual server running on NGINX). For example, the following configuration defines a group named backend that consists of three server configurations (which may resolve in more than three actual servers):
http {
upstream backend {
server backend1.example.com weight=5;
server backend2.example.com;
server 192.0.0.1 backup;
}
}
To pass requests to a server group, the name of the group is specified in the proxy_pass directive (or the fastcgi_pass, memcached_pass, scgi_pass, or uwsgi_pass directive for those protocols). In the next example, a virtual server running on NGINX passes all requests to the backend upstream group defined in the previous example:
server {
location / {
proxy_pass http://backend;
}
}
The following example combines the two snippets above and shows how to proxy HTTP requests to the backend server group. The group consists of three servers, two of them running instances of the same application while the third is a backup server. Because no load‑balancing algorithm is specified in the upstream block, NGINX uses the default algorithm, Round Robin:
http {
upstream backend {
server backend1.example.com;
server backend2.example.com;
server 192.0.0.1 backup;
}
server {
location / {
proxy_pass http://backend;
}
}
}
NGINX open source supports four load‑balancing methods, and NGINX Plus adds two more methods:
Round Robin – Requests are distributed evenly across the servers, with server weights taken into consideration. This method is used by default (there is no directive for enabling it):
upstream backend {
# no load balancing method is specified for Round Robin
server backend1.example.com;
server backend2.example.com;
}
Least Connections – A request is sent to the server with the least number of active connections, again with server weights taken into consideration:
upstream backend {
least_conn;
server backend1.example.com;
server backend2.example.com;
}
IP Hash – The server to which a request is sent is determined from the client IP address. In this case, either the first three octets of the IPv4 address or the whole IPv6 address are used to calculate the hash value. The method guarantees that requests from the same address get to the same server unless it is not available.
upstream backend {
ip_hash;
server backend1.example.com;
server backend2.example.com;
}
If one of the servers needs to be temporarily removed from the load‑balancing rotation, it can be marked with the down parameter in order to preserve the current hashing of client IP addresses. Requests that were to be processed by this server are automatically sent to the next server in the group:
upstream backend {
server backend1.example.com;
server backend2.example.com;
server backend3.example.com down;
}
Generic Hash – The server to which a request is sent is determined from a user‑defined key which can be a text string, variable, or a combination. For example, the key may be a paired source IP address and port, or a URI as in this example:
upstream backend {
hash $request_uri consistent;
server backend1.example.com;
server backend2.example.com;
}
The optional consistent parameter to the hash directive enables ketama consistent‑hash load balancing. Requests are evenly distributed across all upstream servers based on the user‑defined hashed key value. If an upstream server is added to or removed from an upstream group, only a few keys are remapped, which minimizes cache misses in the case of load‑balancing cache servers or other applications that accumulate state.
Least Time (NGINX Plus only) – For each request, NGINX Plus selects the server with the lowest average latency and the lowest number of active connections, where the lowest average latency is calculated based on which of the following parameters to the least_time directive is included:
header – Time to receive the first byte from the server
last_byte – Time to receive the full response from the server
last_byte inflight – Time to receive the full response from the server, taking into account incomplete requests
upstream backend {
least_time header;
server backend1.example.com;
server backend2.example.com;
}
Random – Each request will be passed to a randomly selected server. If the two parameter is specified, first, NGINX randomly selects two servers taking into account server weights, and then chooses one of these servers using the specified method:
least_conn – The least number of active connections
least_time=header (NGINX Plus) – The least average time to receive the response header from the server ($upstream_header_time)
least_time=last_byte (NGINX Plus) – The least average time to receive the full response from the server ($upstream_response_time)
upstream backend {
random two least_time=last_byte;
server backend1.example.com;
server backend2.example.com;
server backend3.example.com;
server backend4.example.com;
}
The Random load balancing method should be used for distributed environments where multiple load balancers are passing requests to the same set of backends. For environments where the load balancer has a full view of all requests, use other load balancing methods, such as round robin, least connections and least time.
Note: When configuring any method other than Round Robin, put the corresponding directive (hash, ip_hash, least_conn, least_time, or random) above the list of server directives in the upstream {} block.
By default, NGINX distributes requests among the servers in the group according to their weights using the Round Robin method. The weight parameter to the server directive sets the weight of a server; the default is 1:
upstream backend {
server backend1.example.com weight=5;
server backend2.example.com;
server 192.0.0.1 backup;
}
In the example, backend1.example.com has weight 5; the other two servers have the default weight (1), but the one with IP address 192.0.0.1 is marked as a backup server and does not receive requests unless both of the other servers are unavailable. With this configuration of weights, out of every 6 requests, 5 are sent to backend1.example.com and 1 to backend2.example.com.
The server slow‑start feature prevents a recently recovered server from being overwhelmed by connections, which may time out and cause the server to be marked as failed again.
In NGINX Plus, slow‑start allows an upstream server to gradually recover its weight from 0 to its nominal value after it has been recovered or became available. This can be done with the slow_start parameter to the server directive:
upstream backend {
server backend1.example.com slow_start=30s;
server backend2.example.com;
server 192.0.0.1 backup;
}
The time value (here, 30 seconds) sets the time during which NGINX Plus ramps up the number of connections to the server to the full value.
Note that if there is only a single server in a group, the max_fails, fail_timeout, and slow_start parameters to the server directive are ignored and the server is never considered unavailable.
Session persistence means that NGINX Plus identifies user sessions and routes all requests in a given session to the same upstream server.
NGINX Plus supports three session persistence methods. The methods are set with the sticky directive. (For session persistence with NGINX open source, use the hash or ip_hash directive as described above.)
Sticky cookie – NGINX Plus adds a session cookie to the first response from the upstream group and identifies the server that sent the response. The client’s next request contains the cookie value and NGINX Plus routes the request to the upstream server that responded to the first request:
upstream backend {
server backend1.example.com;
server backend2.example.com;
sticky cookie srv_id expires=1h domain=.example.com path=/;
}
In the example, the srv_id parameter sets the name of the cookie. The optional expires parameter sets the time for the browser to keep the cookie (here, 1 hour). The optional domain parameter defines the domain for which the cookie is set, and the optional path parameter defines the path for which the cookie is set. This is the simplest session persistence method.
Sticky route – NGINX Plus assigns a “route” to the client when it receives the first request. All subsequent requests are compared to the route parameter of the server directive to identify the server to which the request is proxied. The route information is taken from either a cookie or the request URI.
upstream backend {
server backend1.example.com route=a;
server backend2.example.com route=b;
sticky route $route_cookie $route_uri;
}
Cookie learn method – NGINX Plus first finds session identifiers by inspecting requests and responses. Then NGINX Plus “learns” which upstream server corresponds to which session identifier. Generally, these identifiers are passed in an HTTP cookie. If a request contains a session identifier already “learned”, NGINX Plus forwards the request to the corresponding server:
upstream backend {
server backend1.example.com;
server backend2.example.com;
sticky learn
create=$upstream_cookie_examplecookie
lookup=$cookie_examplecookie
zone=client_sessions:1m
timeout=1h;
}
In the example, one of the upstream servers creates a session by setting the cookie EXAMPLECOOKIE in the response.
The mandatory create parameter specifies a variable that indicates how a new session is created. In the example, new sessions are created from the cookie EXAMPLECOOKIE sent by the upstream server.
The mandatory lookup parameter specifies how to search for existing sessions. In our example, existing sessions are searched in the cookie EXAMPLECOOKIE sent by the client.
The mandatory zone parameter specifies a shared memory zone where all information about sticky sessions is kept. In our example, the zone is named client_sessions and is 1 megabyte in size.
This is a more sophisticated session persistence method than the previous two as it does not require keeping any cookies on the client side: all info is kept server‑side in the shared memory zone.
If there are several NGINX instances in a cluster that use the “sticky learn” method, it is possible to sync the contents of their shared memory zones on conditions that:
the zone_sync functionality is configured on each instance
the sync parameter is specified in the sticky learn directive
upstream backend {
    server backend1.example.com;
    server backend2.example.com;
    sticky learn
        create=$upstream_cookie_examplecookie
        lookup=$cookie_examplecookie
        zone=client_sessions:1m
        timeout=1h
        sync;
}
See Runtime State Sharing in a Cluster for details.
With NGINX Plus, it is possible to limit the number of connections to an upstream server by specifying the maximum number with the max_conns parameter.
If the max_conns limit has been reached, the request is placed in a queue for further processing, provided that the queue directive is also included to set the maximum number of requests that can be simultaneously in the queue:
upstream backend {
server backend1.example.com max_conns=3;
server backend2.example.com;
queue 100 timeout=70;
}
If the queue is filled up with requests or the upstream server cannot be selected during the timeout specified by the optional timeout parameter, the client receives an error.
Note that the max_conns limit is ignored if there are idle keepalive connections opened in other worker processes. As a result, the total number of connections to the server might exceed the max_conns value in a configuration where the memory is shared with multiple worker processes.
NGINX can continually test your HTTP upstream servers, avoid the servers that have failed, and gracefully add the recovered servers into the load‑balanced group.
See HTTP Health Checks for instructions how to configure health checks for HTTP.
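As a minimal sketch of an active health check (NGINX Plus; the names follow the earlier examples), the health_check directive goes in the location that proxies to the group, and the upstream group needs a shared memory zone:
upstream backend {
    zone backend 64k;          # shared memory zone, required for active health checks
    server backend1.example.com;
    server backend2.example.com;
}
server {
    location / {
        proxy_pass http://backend;
        health_check;          # periodically probe each upstream server
    }
}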
If an upstream block does not include the zone directive, each worker process keeps its own copy of the server group configuration and maintains its own set of related counters. The counters include the current number of connections to each server in the group and the number of failed attempts to pass a request to a server. As a result, the server group configuration cannot be modified dynamically.
When the zone directive is included in an upstream block, the configuration of the upstream group is kept in a memory area shared among all worker processes. This scenario is dynamically configurable, because the worker processes access the same copy of the group configuration and utilize the same related counters.
The zone directive is mandatory for active health checks and dynamic reconfiguration of the upstream group. However, other features of upstream groups can benefit from the use of this directive as well.
For example, if the configuration of a group is not shared, each worker process maintains its own counter for failed attempts to pass a request to a server (set by the max_fails parameter). In this case, each request gets to only one worker process. When the worker process that is selected to process a request fails to transmit the request to a server, other worker processes don’t know anything about it. While some worker process can consider a server unavailable, others might still send requests to this server. For a server to be definitively considered unavailable, the number of failed attempts during the timeframe set by the fail_timeout parameter must equal max_fails multiplied by the number of worker processes. On the other hand, the zone directive guarantees the expected behavior.
Similarly, the Least Connections load‑balancing method might not work as expected without the zone directive, at least under low load. This method passes a request to the server with the smallest number of active connections. If the configuration of the group is not shared, each worker process uses its own counter for the number of connections and might send a request to the same server that another worker process just sent a request to. However, you can increase the number of requests to reduce this effect. Under high load requests are distributed among worker processes evenly, and the Least Connections method works as expected.
It is not possible to recommend an ideal memory‑zone size, because usage patterns vary widely. The required amount of memory is determined by which features (such as session persistence, health checks, or DNS re‑resolving) are enabled and how the upstream servers are identified.
As an example, with the sticky_route session persistence method and a single health check enabled, a 256‑KB zone can accommodate information about a limited number of upstream servers; exactly how many depends on how the servers are identified (by IP address and port, or by hostname).
The configuration of a server group can be modified at runtime using DNS.
For servers in an upstream group that are identified with a domain name in the server directive, NGINX Plus can monitor changes to the list of IP addresses in the corresponding DNS record, and automatically apply the changes to load balancing for the upstream group, without requiring a restart. This can be done by including the resolver directive in the http block along with the resolve parameter to the server directive:
http {
resolver 10.0.0.1 valid=300s ipv6=off;
resolver_timeout 10s;
server {
location / {
proxy_pass http://backend;
}
}
upstream backend {
zone backend 32k;
least_conn;
# ...
server backend1.example.com resolve;
server backend2.example.com resolve;
}
}
In the example, the resolve parameter to the server directive tells NGINX Plus to periodically re‑resolve the backend1.example.com and backend2.example.com domain names into IP addresses.
The resolver directive defines the IP address of the DNS server to which NGINX Plus sends requests (here, 10.0.0.1). By default, NGINX Plus re‑resolves DNS records at the frequency specified by time‑to‑live (TTL) in the record, but you can override the TTL value with the valid parameter; in the example it is 300 seconds, or 5 minutes.
The optional ipv6=off parameter means only IPv4 addresses are used for load balancing, though resolving of both IPv4 and IPv6 addresses is supported by default.
If a domain name resolves to several IP addresses, the addresses are saved to the upstream configuration and load balanced. In our example, the servers are load balanced according to the Least Connections load‑balancing method. If the list of IP addresses for a server has changed, NGINX Plus immediately starts load balancing across the new set of addresses.
In NGINX Plus R7 and later, NGINX Plus can proxy Microsoft Exchange traffic to a server or a group of servers and load balance it.
To set up load balancing of Microsoft Exchange servers:
In a location block, configure proxying to the upstream group of Microsoft Exchange servers with the proxy_pass directive:
location / {
proxy_pass https://exchange;
# ...
}
In order for Microsoft Exchange connections to pass to the upstream servers, in the location block set the proxy_http_version directive value to 1.1, and the proxy_set_header directive to Connection "", just like for a keepalive connection:
location / {
# ...
proxy_http_version 1.1;
proxy_set_header Connection "";
# ...
}
In the http block, configure an upstream group of Microsoft Exchange servers with an upstream block named the same as the upstream group specified with the proxy_pass directive in Step 1. Then specify the ntlm directive to allow the servers in the group to accept requests with NTLM authentication:
http {
# ...
upstream exchange {
zone exchange 64k;
ntlm;
# ...
}
}
Add Microsoft Exchange servers to the upstream group and optionally specify a load‑balancing method:
http {
# ...
upstream exchange {
zone exchange 64k;
ntlm;
server exchange1.example.com;
server exchange2.example.com;
# ...
}
}
The complete configuration might then look like this:
http {
# ...
upstream exchange {
zone exchange 64k;
ntlm;
server exchange1.example.com;
server exchange2.example.com;
}
server {
listen 443 ssl;
ssl_certificate /etc/nginx/ssl/company.com.crt;
ssl_certificate_key /etc/nginx/ssl/company.com.key;
ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
location / {
proxy_pass https://exchange;
proxy_http_version 1.1;
proxy_set_header Connection "";
}
}
}
For more information about configuring Microsoft Exchange and NGINX Plus, see the Load Balancing Microsoft Exchange Servers with NGINX Plus deployment guide.
With NGINX Plus, the configuration of an upstream server group can be modified dynamically using the NGINX Plus API. A configuration command can be used to view all servers or a particular server in a group, modify the parameters of a particular server, and add or remove servers. For more information and instructions, see Configuring Dynamic Load Balancing with the NGINX Plus API.
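A minimal sketch of exposing that API (NGINX Plus; the port and access restrictions are illustrative choices):
http {
    upstream backend {
        zone backend 64k;      # shared zone makes the group dynamically configurable
        server backend1.example.com;
    }
    server {
        listen 8080;
        location /api {
            api write=on;      # enable the read/write NGINX Plus API
            allow 127.0.0.1;   # restrict access to the API endpoint
            deny all;
        }
    }
}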
When caching is enabled, NGINX Plus saves responses in a disk cache and uses them to respond to clients without having to proxy requests for the same content every time.
To learn more about NGINX Plus’s caching capabilities, watch the Content Caching with NGINX webinar on demand and get an in‑depth review of features such as dynamic content caching, cache purging, and delayed caching.
To enable caching, include the proxy_cache_path directive in the top‑level http {} context. The mandatory first parameter is the local filesystem path for cached content, and the mandatory keys_zone parameter defines the name and size of the shared memory zone that is used to store metadata about cached items:
http {
...
proxy_cache_path /data/nginx/cache keys_zone=one:10m;
}
Then include the proxy_cache directive in the context (protocol type, virtual server, or location) for which you want to cache server responses, specifying the zone name defined by the keys_zone parameter to the proxy_cache_path directive (in this case, one):
http {
...
proxy_cache_path /data/nginx/cache keys_zone=one:10m;
server {
proxy_cache one;
location / {
proxy_pass http://localhost:8000;
}
}
}
Note that the size defined by the keys_zone parameter does not limit the total amount of cached response data. Cached responses themselves are stored with a copy of the metadata in specific files on the filesystem. To limit the amount of cached response data, include the max_size parameter to the proxy_cache_path directive. (But note that the amount of cached data can temporarily exceed this limit, as described in the following section.)
There are two additional NGINX processes involved in caching:
The cache manager is activated periodically to check the state of the cache. If the cache size exceeds the limit set by the max_size parameter to the proxy_cache_path directive, the cache manager removes the data that was accessed least recently. As previously mentioned, the amount of cached data can temporarily exceed the limit during the time between cache manager activations.
The cache loader runs only once, right after NGINX starts. It loads metadata about previously cached data into the shared memory zone. Loading the whole cache at once could consume sufficient resources to slow NGINX performance during the first few minutes after startup. To avoid this, configure iterative loading of the cache by including the following parameters to the proxy_cache_path directive:
loader_threshold – Duration of an iteration, in milliseconds (by default, 200)
loader_files – Maximum number of items loaded during one iteration (by default, 100)
loader_sleeps – Delay between iterations, in milliseconds (by default, 50)
In the following example, iterations last 300 milliseconds or until 200 items have been loaded:
proxy_cache_path /data/nginx/cache keys_zone=one:10m loader_threshold=300 loader_files=200;
By default, NGINX Plus caches all responses to requests made with the HTTP GET and HEAD methods the first time such responses are received from a proxied server. As the key (identifier) for a request, NGINX Plus uses the request string. If a request has the same key as a cached response, NGINX Plus sends the cached response to the client. You can include various directives in the http {}, server {}, or location {} context to control which responses are cached.
To change the request characteristics used in calculating the key, include the proxy_cache_key directive:
proxy_cache_key "$host$request_uri$cookie_user";
To define the minimum number of times that a request with the same key must be made before the response is cached, include the proxy_cache_min_uses directive:
proxy_cache_min_uses 5;
To cache responses to requests with methods other than GET and HEAD, list them along with GET and HEAD as parameters to the proxy_cache_methods directive:
proxy_cache_methods GET HEAD POST;
By default, responses remain in the cache indefinitely. They are removed only when the cache exceeds the maximum configured size, and then in order by length of time since they were last requested. You can set how long cached responses are considered valid, or even whether they are used at all, by including directives in the http {}, server {}, or location {} context:
To limit how long cached responses with specific status codes are considered valid, include the proxy_cache_valid directive:
proxy_cache_valid 200 302 10m;
proxy_cache_valid 404 1m;
In this example, responses with the code 200 or 302 are considered valid for 10 minutes, and responses with code 404 are valid for 1 minute. To define the validity time for responses with all status codes, specify any as the first parameter:
proxy_cache_valid any 5m;
To define conditions under which NGINX Plus does not send cached responses to clients, include the proxy_cache_bypass directive. Each parameter defines a condition and consists of a number of variables. If at least one parameter is not empty and does not equal “0” (zero), NGINX Plus does not look up the response in the cache, but instead forwards the request to the backend server immediately.
proxy_cache_bypass $cookie_nocache $arg_nocache$arg_comment;
To define conditions under which NGINX Plus does not cache a response at all, include the proxy_no_cache directive, defining parameters in the same way as for the proxy_cache_bypass directive.
proxy_no_cache $http_pragma $http_authorization;
NGINX makes it possible to remove outdated cached files from the cache, which prevents serving old and new versions of web pages at the same time. The cache is purged upon receiving a special “purge” request that contains either a custom HTTP header or the HTTP PURGE method.
Let’s set up a configuration that identifies requests that use the HTTP PURGE method and deletes matching URLs.
In the http {} context, create a new variable, for example $purge_method, that depends on the $request_method variable:
http {
...
map $request_method $purge_method {
PURGE 1;
default 0;
}
}
In the location {} block where caching is configured, include the proxy_cache_purge directive to specify a condition for cache‑purge requests. In our example, it is the $purge_method configured in the previous step:
server {
listen 80;
server_name www.example.com;
location / {
proxy_pass https://localhost:8002;
proxy_cache mycache;
proxy_cache_purge $purge_method;
}
}
When the proxy_cache_purge directive is configured, you need to send a special cache‑purge request to purge the cache. You can issue purge requests using a range of tools, including the curl command as in this example:
$ curl -X PURGE -D - "https://www.example.com/*"
HTTP/1.1 204 No Content
Server: nginx/1.15.0
Date: Sat, 19 May 2018 16:33:04 GMT
Connection: keep-alive
In the example, the resources that have a common URL part (specified by the asterisk wildcard) are purged. However, such cache entries are not removed completely from the cache: they remain on disk until they are deleted either for inactivity (as determined by the inactive parameter to the proxy_cache_path directive), by the cache purger (enabled with the purger parameter to proxy_cache_path), or when a client attempts to access them.
We recommend that you limit the number of IP addresses that are allowed to send a cache‑purge request:
geo $purge_allowed {
default 0; # deny from other
10.0.0.1 1; # allow from this address
192.168.0.0/24 1; # allow from the 192.168.0.0/24 network
}
map $request_method $purge_method {
PURGE $purge_allowed;
default 0;
}
In this example, NGINX checks if the PURGE method is used in a request, and, if so, analyzes the client IP address. If the IP address is whitelisted, then the $purge_method is set to $purge_allowed: 1 permits purging, and 0 denies it.
To completely remove cache files that match an asterisk, activate a special cache purger process that permanently iterates through all cache entries and deletes the entries that match the wildcard key. Include the purger parameter to the proxy_cache_path directive in the http {} context:
proxy_cache_path /data/nginx/cache levels=1:2 keys_zone=mycache:10m purger=on;
The full configuration then looks like this:
http {
...
proxy_cache_path /data/nginx/cache levels=1:2 keys_zone=mycache:10m purger=on;
map $request_method $purge_method {
PURGE 1;
default 0;
}
server {
listen 80;
server_name www.example.com;
location / {
proxy_pass https://localhost:8002;
proxy_cache mycache;
proxy_cache_purge $purge_method;
}
}
geo $purge_allowed {
default 0;
10.0.0.1 1;
192.168.0.0/24 1;
}
map $request_method $purge_method {
PURGE $purge_allowed;
default 0;
}
}
The initial cache fill operation sometimes takes quite a long time, especially for large files. For example, when a video file starts downloading to fulfill the initial request for a part of the file, subsequent requests have to wait for the entire file to be downloaded and put into the cache.
NGINX makes it possible to cache such range requests and gradually fill the cache with the Cache Slice module, which divides files into smaller “slices”. Each range request chooses particular slices that cover the requested range and, if this range is still not cached, puts it into the cache. All other requests for these slices take the data from the cache.
To enable byte‑range caching:
Make sure NGINX is compiled with the Cache Slice module.
Specify the size of the slice with the slice directive:
location / {
slice 1m;
}
Choose a slice size that makes slice downloading fast. If the size is too small, memory usage might be excessive and a large number of file descriptors opened while processing the request, while an excessively large size might cause latency.
Include the $slice_range variable in the cache key:
proxy_cache_key $uri$is_args$args$slice_range;
Enable caching of responses with the 206 status code:
proxy_cache_valid 200 206 1h;
Enable passing of range requests to the proxied server by setting the $slice_range variable in the Range header field:
proxy_set_header Range $slice_range;
Here’s the full configuration:
location / {
slice 1m;
proxy_cache cache;
proxy_cache_key $uri$is_args$args$slice_range;
proxy_set_header Range $slice_range;
proxy_cache_valid 200 206 1h;
proxy_pass http://localhost:8000;
}
Note that if slice caching is turned on, the initial file must not be changed.
The following sample configuration combines some of the caching options described above.
http {
...
proxy_cache_path /data/nginx/cache keys_zone=one:10m loader_threshold=300
loader_files=200 max_size=200m;
server {
listen 8080;
proxy_cache one;
location / {
proxy_pass http://backend1;
}
location /some/path {
proxy_pass http://backend2;
proxy_cache_valid any 1m;
proxy_cache_min_uses 3;
proxy_cache_bypass $cookie_nocache $arg_nocache$arg_comment;
}
}
}
In this example, two locations use the same cache but in different ways.
Because responses from backend1 rarely change, no cache‑control directives are included. Responses are cached the first time a request is made, and remain valid indefinitely.
In contrast, responses to requests served by backend2 change frequently, so they are considered valid for only 1 minute and aren’t cached until the same request is made 3 times. Moreover, if a request matches the conditions defined by the proxy_cache_bypass directive, NGINX Plus immediately passes the request to backend2 without looking for the corresponding response in the cache.