
Principles and Implementation of IP Network Multipathing in Solaris 8

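1. Introduction:
IP Network Multipathing is a new feature of the Solaris 8 operating environment that provides network failover and IP link aggregation on the host side. Its key features are:
- Failure detection and failover: when a network adapter fails (the fault may lie in the host adapter itself or in the network device it is attached to), IP Network Multipathing detects the failure and automatically switches network access to an alternate adapter, removing the host network adapter as a single point of failure.
- Repair detection (failback): when the failed adapter has been repaired, IP Network Multipathing detects the repair and automatically switches network access back to the primary adapter that originally provided the service.
- Outbound load spreading: packets sent by higher-layer applications can be spread across multiple network adapters to increase network throughput. Note that load spreading takes place only when traffic flows to multiple destinations over multiple connections.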
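2. Principles:
In Solaris 8, the in.mpathd daemon (/sbin/in.mpathd) performs failure detection and carries out failover and failback according to the configured policy.
- Detecting physical interface failure: all or some of the host's network interfaces managed by in.mpathd are organized into a multipathing interface group, and each interface in the group is assigned a test address. In normal operation, in.mpathd continuously sends ICMP ECHO packets from the test address of every interface in the group to a target host to check that interface's connectivity. The target is normally a router on the local network. If no router exists, in.mpathd uses arbitrary hosts on the network instead: it sends a multicast packet to all hosts on the network, and the first host to reply becomes the target used to test connectivity for the interface group. If the target fails to answer 5 consecutive probes, in.mpathd considers the link failed. The default failure detection time is 10 seconds, so a probe is sent roughly every 2 seconds. If a standby interface is configured in the multipathing interface group, all network access is switched to the standby interface automatically.
- Detecting physical interface repair: to find out whether a failed interface has been repaired, in.mpathd keeps sending probe packets to the target through that interface's test address. Once it receives 10 consecutive replies, in.mpathd considers the interface repaired, and all services that were moved to the standby interface are automatically moved back to the original interface.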
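The two detection rules above (5 consecutive missed replies mark a failure, 10 consecutive replies mark a repair) can be sketched as a small state machine. This is only an illustrative model in Python, not the in.mpathd implementation: the names `ProbeState`, `record`, and `replay` are hypothetical, and the thresholds are simply the defaults quoted in the text.

```python
from dataclasses import dataclass

FAILURE_THRESHOLD = 5          # consecutive missed replies before failover
REPAIR_THRESHOLD = 10          # consecutive replies before failback
FAILURE_DETECTION_TIME = 10.0  # seconds, the default quoted in the text
PROBE_INTERVAL = FAILURE_DETECTION_TIME / FAILURE_THRESHOLD  # about 2 s


@dataclass
class ProbeState:
    """Probe bookkeeping for one interface (hypothetical model)."""
    failed: bool = False
    misses: int = 0
    hits: int = 0

    def record(self, replied: bool) -> str:
        """Feed one probe result; return 'failover', 'failback' or 'none'."""
        if replied:
            self.misses = 0
            self.hits += 1
            if self.failed and self.hits >= REPAIR_THRESHOLD:
                self.failed = False
                return "failback"
        else:
            self.hits = 0
            self.misses += 1
            if not self.failed and self.misses >= FAILURE_THRESHOLD:
                self.failed = True
                return "failover"
        return "none"


def replay(results):
    """Replay a list of probe results; return (probe_number, event) pairs."""
    state = ProbeState()
    events = []
    for number, replied in enumerate(results, start=1):
        event = state.record(replied)
        if event != "none":
            events.append((number, event))
    return events


if __name__ == "__main__":
    # 3 good probes, 5 lost probes, then 10 good probes again.
    print(replay([True] * 3 + [False] * 5 + [True] * 10))
```

A single reply in the middle of a run of losses resets the miss counter, so short glitches do not trigger a failover in this model.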
3. Configuration:
- Configuring failure detection and failover:
- Command line:
ifconfig group
ifconfig group
ifconfig addif -failover deprecated up
ifconfig addif -failover deprecated up
- hostname file:
[/etc/hostname.]
group up \
addif -failover deprecated up
[/etc/hostname.]
group up \
addif -failover deprecated up
- How can we confirm that multipathing is working correctly?
(1) decide IP address and the interface name
Suppose the following IP address assignment.
le0 129.158.70.34, le0:1 129.158.70.82
le1 129.158.70.81, le1:1 129.158.70.83
Suppose the group name is “multipath-test”.
(2) configure IP multipathing
# ifconfig le0 group multipath-test
# ifconfig le1 group multipath-test
# ifconfig le0 addif 129.158.70.82 -failover deprecated up
# ifconfig le1 addif 129.158.70.83 -failover deprecated up
(3) confirm the configuration validity
# ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
le0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 129.158.70.34 netmask ffffff00 broadcast 129.158.70.255
groupname multipath-test ether 8:0:20:1c:0:4f
le0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
inet 129.158.70.82 netmask ffff0000 broadcast 129.158.255.255
le1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 129.158.70.81 netmask ffffff00 broadcast 129.158.70.255
groupname multipath-test ether 8:0:20:1c:0:4f
le1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
inet 129.158.70.83 netmask ffff0000 broadcast 129.158.255.255
(4) create a session
Create telnet session to another host.
(5) see the physical interface utilization
We can confirm only le0 is used from the netstat output.
# netstat -ran | grep UHA
129.158.118.65 — UHA 3 446 le0
(6) disable le0
When the network adapter le0 fails, the ifconfig output changes as
follows: le0 is marked “FAILED” and its address moves to le1:2.
The failure can be simulated by unplugging the network cable from le0.
# ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
le0: flags=19000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 2
inet 0.0.0.0 netmask 0
groupname multipath-test ether 8:0:20:1c:0:4f
le0:1: flags=19040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 2
inet 129.158.70.82 netmask ffff0000 broadcast 129.158.255.255
le1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
inet 129.158.70.81 netmask ffffff00 broadcast 129.158.70.255
groupname multipath-test ether 8:0:20:1c:0:4f
le1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
inet 129.158.70.83 netmask ffff0000 broadcast 129.158.255.255
le1:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 129.158.70.34 netmask ffffff00 broadcast 129.158.70.255
The connection to the other host is maintained by switching to the le1 interface.
# netstat -ran | grep UHA
129.158.118.65 — UHA 2 26 le1
This is the failover.
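The flags= values in the ifconfig output are hexadecimal bit masks, so states such as FAILED or NOFAILOVER can be read directly from the number. The following Python sketch decodes them. The bit values are the ones I recall from the Solaris <net/if.h> header and should be treated as an assumption to verify against the header on your system; `decode_flags` is a hypothetical helper name, and only the flags that appear in this article are listed.

```python
# Bit values as recalled from Solaris <net/if.h>; verify on your system.
IFF_FLAGS = [
    (0x00000001, "UP"),
    (0x00000002, "BROADCAST"),
    (0x00000008, "LOOPBACK"),
    (0x00000040, "RUNNING"),
    (0x00000800, "MULTICAST"),
    (0x00040000, "DEPRECATED"),
    (0x01000000, "IPv4"),
    (0x08000000, "NOFAILOVER"),
    (0x10000000, "FAILED"),
]


def decode_flags(hex_value):
    """Return the flag names set in a hex string such as '19040843'."""
    value = int(hex_value, 16)
    names = [name for bit, name in IFF_FLAGS if value & bit]
    known = sum(bit for bit, _ in IFF_FLAGS)
    leftover = value & ~known
    if leftover:
        # Bits not covered by the table above are reported, not guessed.
        names.append("unknown:0x%x" % leftover)
    return names


if __name__ == "__main__":
    for flags in ("1000843", "9040843", "19000842", "19040843"):
        print(flags, ",".join(decode_flags(flags)))
```

For example, 19000842 decodes without UP but with FAILED, which is how the failed le0 interface appears in step (6).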
- Repair detection (failback):
(1) repair le0
When the le0 adapter is repaired, the le0 interface works again as follows.
The repair can be simulated by reconnecting the network cable.
# ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
le0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 129.158.70.34 netmask ffffff00 broadcast 129.158.70.255
groupname multipath-test ether 8:0:20:1c:0:4f
le0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
inet 129.158.70.82 netmask ffff0000 broadcast 129.158.255.255
le1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
inet 129.158.70.81 netmask ffffff00 broadcast 129.158.70.255
groupname multipath-test ether 8:0:20:1c:0:4f
le1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
This is the failback.
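The address movement seen in the failover and failback transcripts can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions, not how the kernel tracks addresses: each interface is a dictionary of logical slots (slot 0 is the physical interface, slot 1 is le0:1, and so on), test addresses never move, and the helper names `failover`, `failback`, and `show` are hypothetical.

```python
UNSET = "0.0.0.0"


def failover(table, test_addrs, failed, survivor):
    """Move data addresses off `failed`; return a record for failback."""
    moved = []
    for slot in sorted(table[failed]):
        ip = table[failed][slot]
        if ip == UNSET or ip in test_addrs:
            continue  # test addresses are marked -failover and stay put
        new_slot = max(table[survivor]) + 1  # next free logical interface
        table[survivor][new_slot] = ip
        table[failed][slot] = UNSET
        moved.append((ip, failed, slot, survivor, new_slot))
    return moved


def failback(table, moved):
    """Return every moved address to the slot it came from."""
    for ip, home, slot, host, host_slot in moved:
        del table[host][host_slot]
        table[home][slot] = ip


def show(table):
    """Flatten the table into {'le0': ip, 'le0:1': ip, ...}."""
    out = {}
    for ifname, slots in table.items():
        for slot in sorted(slots):
            name = ifname if slot == 0 else "%s:%d" % (ifname, slot)
            out[name] = slots[slot]
    return out


if __name__ == "__main__":
    table = {
        "le0": {0: "129.158.70.34", 1: "129.158.70.82"},
        "le1": {0: "129.158.70.81", 1: "129.158.70.83"},
    }
    tests = {"129.158.70.82", "129.158.70.83"}
    record = failover(table, tests, "le0", "le1")
    print(show(table))   # 129.158.70.34 now sits on le1:2
    failback(table, record)
    print(show(table))   # back to the original layout
```

With the addresses from the example, the model reproduces the transcripts: after failover le0 shows 0.0.0.0 and 129.158.70.34 appears on le1:2, and after failback the original layout returns.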
- Configuring outbound load spreading:
- Command-line configuration:
ifconfig group
ifconfig group
ifconfig addif -failover deprecated up
ifconfig addif -failover deprecated up
- Configuration by editing the configuration files:
[/etc/hostname.]
group up \
addif -failover deprecated up
[/etc/hostname.]
group up \
addif -failover deprecated up
- How can we confirm whether it is working?
Here is an example flow.
(1) decide IP address and the interface name
Suppose the following IP address and the interface name.
Host-A: 129.158.70.34(le0), 129.158.70.81(le1)
Host-B: 129.158.118.65
Host-C: 129.158.118.183
Suppose the group name is “multipath-test”.
(2) create a session
Create telnet session from Host-B to Host-A by explicitly specifying
129.158.70.34.
(3) identify which interface is used on Host-A
# netstat -ran | grep UHA
Destination Gateway Flags Ref Use Interface
-------------------- -------------------- ----- ----- ------ ---------
129.158.118.65 129.158.70.246 UHA 2 100 le0 <--(*)
From the output above, we can see only le0 is used.
(4) configure IP multipathing on Host-A
# ifconfig le0 group multipath-test
# ifconfig le1 group multipath-test
# ifconfig le0 addif 129.158.70.82 -failover deprecated up
# ifconfig le1 addif 129.158.70.83 -failover deprecated up
(5) confirm the configuration validity
# ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
le0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 129.158.70.34 netmask ffffff00 broadcast 129.158.70.255
groupname multipath-test ether 8:0:20:1c:0:4f
le0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
inet 129.158.70.82 netmask ffff0000 broadcast 129.158.255.255
le1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 129.158.70.81 netmask ffffff00 broadcast 129.158.70.255
groupname multipath-test ether 8:0:20:1c:0:4f
le1:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
inet 129.158.70.83 netmask ffff0000 broadcast 129.158.255.255
(6) create another session
Create telnet session from Host-C to Host-A by explicitly specifying
129.158.70.34.
(7) see the physical interface utilization
# netstat -ran | grep UHA
Destination Gateway Flags Ref Use Interface
-------------------- -------------------- ----- ----- ------ ---------
129.158.118.65 129.158.70.246 UHA 3 286 le1 < --(*)
129.158.118.183 129.158.70.246 UHA 2 140 le0 <--(*)
We can confirm that both le0 and le1 are automatically used at the same time.
This is the outbound load spreading.
If you skip the IP multipathing configuration at (4), you will see that only the le0 interface is used, as follows.
# netstat -ran | grep UHA
129.158.118.65 129.158.70.246 UHA 3 140 le0 <--(*)
129.158.118.183 129.158.70.246 UHA 2 53 le0 <--(*)
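The effect shown above, where traffic to different destinations leaves through different interfaces of the group, can be illustrated with a small Python sketch. The article does not describe how Solaris actually chooses the interface for each destination, so the round-robin policy here is purely an assumption for illustration, and `spread` is a hypothetical helper name.

```python
from itertools import cycle


def spread(destinations, interfaces):
    """Assign each distinct destination to an interface, round-robin.

    Round-robin is an illustrative assumption; the real selection
    policy used by the operating system may differ.
    """
    if not interfaces:
        raise ValueError("at least one interface is required")
    next_interface = cycle(interfaces)
    assignment = {}
    for destination in destinations:
        if destination not in assignment:
            assignment[destination] = next(next_interface)
    return assignment


if __name__ == "__main__":
    peers = ["129.158.118.65", "129.158.118.183"]
    print(spread(peers, ["le0", "le1"]))  # two interfaces share the load
    print(spread(peers, ["le0"]))         # no group: everything on le0
```

The sketch also shows why load spreading needs several destinations: with a single destination there is only one assignment, so only one interface carries the traffic.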
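4. Summary:
Until now, on the Solaris 8 platform, requirements such as load balancing, bandwidth aggregation, and failover have also brought the Sun Trunking software to mind. On the host side, Sun Trunking aggregates the bandwidth of individual Fast Ethernet or Gigabit Ethernet links into one logical high-bandwidth data link, raising the throughput of a Fast Ethernet logical port by up to a factor of eight and that of a Gigabit Ethernet port by up to a factor of two. Sun Trunking also improves the network's load balancing and fault tolerance, which greatly improves overall system performance. The software is fully compatible with switches from leading vendors such as 3Com, Nortel Networks, Cabletron, Cisco, Extreme Networks, and Foundry Networks.
Other characteristics of Sun Trunking:
- Sun Trunking works at the MAC layer.
- Sun Trunking can share load across multiple network ports.
- Sun Trunking requires dedicated software, Sun Trunking 1.x.
- Sun Trunking requires specific types of network adapter (QFE, GE) and a switch compatible with Sun Trunking.
- Sun Trunking binds up to eight network interfaces to a single IP address.
- The network interfaces managed by Sun Trunking must have the same hardware type and driver; they cannot be mixed.
The corresponding characteristics of an interface group:
- An interface group works at the IP layer.
- A single interface group can mix different types of network adapter, such as hme, le, and qfe.
- The interfaces in an interface group can be configured with different IP addresses on the same subnet.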
