High Linux Concurrency Causes an (HDS) HNAS NODE Reset

I recently picked up a very strange case, and I want to write it up here.

NFS_Volume_01 is hosted on NODE1, and close to 20 hosts have this volume mounted.

When the volume came under heavy concurrent access, Node2 was reset. The reset was presumably triggered by the SMU, because the cluster heartbeat had failed by that point.

The evidence is as follows:

Warning: Cluster: Heart beating over high speed interconnect link to Node ID 1 lost.


********************************** Key Background *********************************

Starting with RHEL 6.3, one RPC parameter changed:

tcp_max_slot_table_entries=65536

On RHEL 6.2 and earlier this value never went above 128 (the old scheme defaulted to 16 slots, with a hard maximum of 128).

You can verify this in the Red Hat slides below (page 22):

http://www.slideshare.net/sshaaf/red-hat-enterprise-linux-and-nfs
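
On a RHEL 6.3+ client you can confirm the new ceiling directly; a minimal check, assuming the sunrpc module is loaded (these are the standard sunrpc sysctl paths):

    # Show the current cap on concurrent in-flight RPC requests;
    # RHEL 6.3 and later default to 65536:
    cat /proc/sys/sunrpc/tcp_max_slot_table_entries

    # Temporarily pull it back to the pre-6.3 maximum on a live system:
    sysctl -w sunrpc.tcp_max_slot_table_entries=128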

********************************** Key Background *********************************


Given the analysis above, an obvious question arises: why would highly concurrent access break the heartbeat?

Following that thread, I found another reference:

Introduced in Red Hat RHEL 6.3, clients can dynamically allocate resources for RPC requests.
In versions prior to 6.3 the number of RPC requests were limited to a default of 16 RPC requests, with a maximum of 128 in-flight requests. The change allows clients to handle a significantly higher number of in-flight RPC requests as long as the ARX system continues to respond to the TCP segment of the request. This, in turn, allows a Red Hat client to flood the ARX system with RPC requests. As the number of outstanding requests grow, the ARX propkt service will begin to reach its resource threshold and eventually the service will core.
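
You can watch this pressure build from the client side with standard NFS tooling; nothing HNAS-specific is assumed here:

    # Client-side RPC statistics: a steadily climbing "retrans" counter
    # means the server is no longer acknowledging requests as fast as
    # the client is issuing them:
    nfsstat -rc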

********************************** Key Background *********************************

HDS HNAS uses F5's ARX technology internally, which is also the main reason it can do NFS data migration across heterogeneous storage.

The passage above also mentions an RPC service, which is worth explaining properly:

http://en.wikipedia.org/wiki/Remote_procedure_call

Wikipedia explains it as follows:

Sequence of events during an RPC

  1. The client calls the client stub. The call is a local procedure call, with parameters pushed on to the stack in the normal way.
  2. The client stub packs the parameters into a message and makes a system call to send the message. Packing the parameters is called marshalling.
  3. The client's local operating system sends the message from the client machine to the server machine.
  4. The local operating system on the server machine passes the incoming packets to the server stub.
  5. The server stub unpacks the parameters from the message. Unpacking the parameters is called unmarshalling.
  6. Finally, the server stub calls the server procedure. The reply traces the same steps in the reverse direction.

As you can see, an RPC is by nature a request initiated by the client, which the server then answers.
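
This client-initiated pattern is easy to observe on any Linux NFS client; a small illustration (the server name hnas-node1 is a placeholder):

    # Ask the server's portmapper which RPC programs it has registered
    # (nfs, mountd, nlockmgr, portmapper, ...):
    rpcinfo -p hnas-node1

    # Count the RPC calls this client has issued, by procedure:
    nfsstat -c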

********************************** Key Background *********************************

So when a large number of TCP connections arrive at once, the propkt and NSMD processes are driven to extreme utilization, which in turn hangs the ARX system outright.

In the end the heartbeat stops responding, and NVRAM can no longer mirror to the peer node. The two failures therefore stem from a single root cause; they are not, as HDS claimed, two separate faults.

See the statements on F5's official support site:

https://support.f5.com/kb/en-us/solutions/public/14000/400/sol14478

https://support.f5.com/kb/en-us/solutions/public/12000/300/sol12323

For the HDS HNAS 3080, 3090, and 4040 models, the only mitigation for now is on the client side; there is no other fix at present.
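
Concretely, the client-side mitigation is to pin the slot table back to its pre-6.3 ceiling and make the setting persist. A sketch of the commonly documented approach (the value 128 follows the F5 articles above; verify against your own kernel):

    # /etc/modprobe.d/sunrpc.conf -- picked up when the sunrpc module
    # loads, so NFS mounts made at boot already see the capped value:
    options sunrpc tcp_max_slot_table_entries=128

    # For a client whose sunrpc module is already loaded:
    sysctl -w sunrpc.tcp_max_slot_table_entries=128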


Finally, I am still left with one question:

Do EMC and NetApp really not face a similar predicament? I will dig into that in a follow-up post.



**************************************************** Notes *****************************************************************

This reminds me of buffer credits in the FC protocol. FC handles this kind of flow control very well: once the receiver's buffers are close to overflowing, the switch holds the traffic back. It raises errors on one hand, and on the other, after discarding Class 3 frames (C3 discards), it forces the initiator to resend, slowing the whole pipeline down. That does an excellent job of safeguarding data consistency.

Brocade's explanation is as follows:

• Latency bottleneck
• Congestion bottleneck

A latency bottleneck is a port where the offered load exceeds the rate at which the other end of the link can continuously accept traffic, but does not exceed the physical capacity of the link. This condition can be caused by a device attached to the fabric that is slow to process received frames and send back credit returns. A latency bottleneck due to such a device can spread through the fabric and can slow down unrelated flows that share links with the slow flow.

A congestion bottleneck is a port that is unable to transmit frames at the offered rate because the offered rate is greater than the physical data rate of the line. For example, this condition can be caused by trying to transfer data at 8 Gbps over a 4 Gbps ISL.
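
On Brocade FOS switches, both bottleneck types can be monitored directly. A sketch, assuming a FOS release that ships bottleneck detection (the exact syntax varies across FOS versions):

    # Enable bottleneck detection with alerting on the switch:
    bottleneckmon --enable -alert

    # List ports currently flagged as latency or congestion bottlenecks:
    bottleneckmon --show

    # C3 discards per port also appear in the error summary:
    porterrshow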



