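Two IBM x3650M3 servers running CentOS 5.9 x64 are attached to an IBM DS3400 storage array, with a GFS file system underneath providing shared file access. The database is a separate, independent Oracle RAC cluster, so it is out of scope for this architecture.
The GFS file system and its configuration are covered in the previous article, "IBM x3650M3+GFS+IPMI fence生产环境配置一例" (IBM x3650M3 + GFS + IPMI fence: a production configuration example); this article builds on it. The two servers have the hostnames node01 and node02. Because the application architecture is relatively simple and server resources are limited, the two servers form a high-availability pair in which each acts as the other's backup. Source: http://koumm.blog.51cto.com/
IBM x3650M3+GFS+IPMI fence生产环境配置一例
http://koumm.blog.51cto.com/703525/1544971
The architecture is as follows:
Note: on IBM servers, the dedicated IMM2 port (or the network port labeled SYSTEM MGMT) must be connected to the switch and placed on the same subnet as a local IP address.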
node01:
ipmi: 10.10.10.85/24
eth1: 192.168.233.83/24
eth1:0 10.10.10.87/24
node02:
ipmi: 10.10.10.86/24
eth1: 192.168.233.84/24
eth1:0 10.10.10.88/24
# cat /etc/hosts
192.168.233.83 node01
192.168.233.84 node02
192.168.233.90 vip
10.10.10.85 node01_ipmi
10.10.10.86 node02_ipmi
Keepalived provides a single floating VIP; this example uses 192.168.233.90.
Note: keepalived-1.2.12 was installed and works without problems.
wget http://www.keepalived.org/software/keepalived-1.2.12.tar.gz
tar zxvf keepalived-1.2.12.tar.gz
cd keepalived-1.2.12
./configure --prefix=/usr/local/keepalived
make && make install
cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d/
mkdir /etc/keepalived
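Edit the configuration file; the interface bound is eth1.
Note: the first listing is for the master (node01). The backup (node02) is identical except for its priority and its local IP (mcast_src_ip).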
# vi /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    notification_email {
        [email protected]
    }
    notification_email_from [email protected]
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id LVS_DEVEL
}
vrrp_instance VI_1 {
    state MASTER
    interface eth1
    virtual_router_id 51
    mcast_src_ip 192.168.233.83
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 876543
    }
    virtual_ipaddress {
        192.168.233.90
    }
}
# vi /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    notification_email {
        [email protected]
    }
    notification_email_from [email protected]
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id LVS_DEVEL
}
vrrp_instance VI_1 {
    state MASTER
    interface eth1
    virtual_router_id 51
    mcast_src_ip 192.168.233.84
    priority 99
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 876543
    }
    virtual_ipaddress {
        192.168.233.90
    }
}
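Since the two keepalived.conf files are meant to differ only in mcast_src_ip and priority, a quick comparison can catch copy-and-paste drift between the nodes. The following is a minimal sketch, not part of keepalived: conf_drift is a hypothetical helper name, and it assumes each directive sits on its own line and that the second node's file has been copied locally for comparison.

```shell
#!/bin/sh
# conf_drift FILE_A FILE_B
# Hypothetical helper (not shipped with keepalived): removes the two
# per-node directives (mcast_src_ip, priority) from both files and diffs
# the remainder. Returns 0 if nothing else differs, 1 if something does.
conf_drift() {
    pattern='^[[:space:]]*(mcast_src_ip|priority)[[:space:]]'
    tmp_a=$(mktemp) || return 2
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 2; }
    # grep -v exits non-zero when it prints nothing; that is not an error here.
    grep -Ev "$pattern" "$1" > "$tmp_a" || true
    grep -Ev "$pattern" "$2" > "$tmp_b" || true
    rc=0
    diff "$tmp_a" "$tmp_b" || rc=$?
    rm -f "$tmp_a" "$tmp_b"
    return "$rc"
}

# Example (the second path is an assumption about where the copy was placed):
#   conf_drift /etc/keepalived/keepalived.conf /tmp/node02-keepalived.conf \
#       && echo "configs match apart from mcast_src_ip and priority"
```

Any line that diff prints points at a setting, such as virtual_router_id or auth_pass, that must be identical on both nodes for the VRRP instance to work.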
service keepalived start
chkconfig --add keepalived
chkconfig keepalived on
On the master, the VIP can be seen as follows:
[root@node01 /]# service keepalived start
Starting keepalived: [ OK ]
[root@node01 /]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
    link/ether e4:1f:13:65:0e:a0 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether e4:1f:13:65:0e:a2 brd ff:ff:ff:ff:ff:ff
    inet 192.168.233.83/24 brd 192.168.230.255 scope global eth1
    inet 10.10.10.87/24 brd 10.10.10.255 scope global eth1:0
    inet 192.168.233.90/32 scope global eth1
4: usb0: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether e6:1f:13:57:0e:a3 brd ff:ff:ff:ff:ff:ff
[root@node01 /]#
Note: to observe the VIP moving, stop the keepalived service on the master and watch the log with cat /var/log/messages.
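Besides reading the log, the failover can be confirmed by checking which node currently holds the VIP. The sketch below is a hypothetical helper (has_vip is not a keepalived or iproute2 command) that scans ip output on stdin for the address; it assumes the VIP 192.168.233.90 and the interface eth1 used in this article.

```shell
#!/bin/sh
# has_vip VIP
# Hypothetical helper: reads `ip addr` style output on stdin and returns 0
# if VIP appears as a bound IPv4 address, 1 otherwise.
has_vip() {
    # -F matches the string literally, so the dots in the address are not
    # treated as regex wildcards; the trailing "/" anchors the prefix length.
    grep -Fq "inet $1/"
}

# Example, run on each node after stopping keepalived on the master:
#   ip addr show dev eth1 | has_vip 192.168.233.90 && echo "VIP is on $(hostname)"
```

After keepalived is stopped on node01, the check should succeed on node02 and fail on node01.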
Perform the following steps on both node01 and node02.
# vi /etc/sysctl.conf
net.ipv4.ip_nonlocal_bind = 1
# sysctl -p
# tar zxvf haproxy-1.4.25.tar.gz
# cd haproxy-1.4.25
# make TARGET=linux26 PREFIX=/usr/local/haproxy
# make install PREFIX=/usr/local/haproxy
# cd /usr/local/haproxy
# mkdir conf
# wget http://www.dest-unreach.org/socat/download/socat-2.0.0-b5.tar.gz
# tar zxvf socat-2.0.0-b5.tar.gz
# cd socat-2.0.0-b5
# ./configure --disable-fips
# make && make install
# vi /usr/local/haproxy/conf/haproxy.cfg
global
    log 127.0.0.1 local0
    maxconn 65535
    chroot /usr/local/haproxy
    uid 99
    gid 99
    stats socket /usr/local/haproxy/HaproxSocket level admin
    daemon
    nbproc 1
    pidfile /usr/local/haproxy/haproxy.pid
    #debug
defaults
    log 127.0.0.1 local3
    mode http
    option httplog
    option httpclose
    option dontlognull
    option forwardfor
    option redispatch
    retries 2
    maxconn 2000
    balance source
    #balance roundrobin
    stats uri /haproxy-stats
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000
listen web_proxy 0.0.0.0:80
    mode http
    option httpchk GET /test.html HTTP/1.0\r\nHost:192.168.233.90
    server node01 192.168.233.83:8000 weight 3 check inter 2000 rise 2 fall 1
    server node02 192.168.233.84:8000 weight 3 backup check inter 2000 rise 2 fall 1
listen stats_auth 0.0.0.0:91
    mode http
    stats enable
    stats uri /admin
    stats realm "Admin_console"
    stats auth admin:123456
    stats hide-version
    stats refresh 10s
    stats admin if TRUE
# vi /usr/local/haproxy/conf/haproxy.cfg
global
    log 127.0.0.1 local0
    maxconn 65535
    chroot /usr/local/haproxy
    uid 99
    gid 99
    stats socket /usr/local/haproxy/HaproxSocket level admin
    daemon
    nbproc 1
    pidfile /usr/local/haproxy/haproxy.pid
    #debug
defaults
    log 127.0.0.1 local3
    mode http
    option httplog
    option httpclose
    option dontlognull
    option forwardfor
    option redispatch
    retries 2
    maxconn 2000
    balance source
    #balance roundrobin
    stats uri /haproxy-stats
    contimeout 5000
    clitimeout 50000
    srvtimeout 50000
listen web_proxy 0.0.0.0:80
    mode http
    option httpchk GET /test.html HTTP/1.0\r\nHost:192.168.233.90
    server node01 192.168.233.83:8000 weight 3 backup check inter 2000 rise 2 fall 1
    server node02 192.168.233.84:8000 weight 3 check inter 2000 rise 2 fall 1
listen stats_auth 0.0.0.0:91
    mode http
    stats enable
    stats uri /admin
    stats realm "Admin_console"
    stats auth admin:123456
    stats hide-version
    stats refresh 10s
    stats admin if TRUE
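Note: the two nodes back each other up. Each HAProxy prefers the application on its own node as the primary server and marks the other node's application as backup. A load-balancing mode can be configured instead.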
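Which server HAProxy sends traffic to depends on the option httpchk result: the check requests /test.html, and HAProxy treats only 2xx and 3xx responses as passing. The sketch below reproduces that check by hand so a backend can be tested before HAProxy is started. The function names healthy_code and probe are hypothetical, and the use of curl is an assumption (it is not part of the original setup).

```shell
#!/bin/sh
# healthy_code CODE
# Returns 0 if an HTTP status code would pass an httpchk check
# (any 2xx or 3xx response), 1 otherwise.
healthy_code() {
    case "$1" in
        2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# probe HOST:PORT
# Sends roughly the same request as the health check (HTTP/1.0, Host header
# set to the VIP) and prints the status code; curl prints 000 when the
# connection fails. Assumes curl is installed.
probe() {
    curl -s -0 -o /dev/null -m 2 -w '%{http_code}' \
        -H 'Host: 192.168.233.90' "http://$1/test.html"
}

# Example:
#   code=$(probe 192.168.233.83:8000)
#   healthy_code "$code" && echo "node01 would be UP" || echo "node01 would be DOWN ($code)"
```

With fall 1 a single failed check marks a server down, and with rise 2 two consecutive passes bring it back, so on each node the backup server only receives traffic while the local application fails this check.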
# vi /etc/syslog.conf
local3.* /var/log/haproxy.log
local0.* /var/log/haproxy.log
*.info;mail.none;authpriv.none;cron.none;local3.none /var/log/messages
Note: the third line stops haproxy log entries from also being written to /var/log/messages.
# vi /etc/sysconfig/syslog
SYSLOGD_OPTIONS="-r -m 0"
Run the following manually:
service syslog restart
touch /var/log/haproxy.log
chown nobody:nobody /var/log/haproxy.log
Note: uid 99 is the nobody user by default.
chmod u+x /var/log/haproxy.log
# vi /root/system/cut_log.sh
#!/bin/bash
# author: koumm
# desc:
# date: 2014-08-28
# version: v1.0
# modify:
# cut haproxy log
if [ -e /var/log/haproxy.log ]; then
    mv /var/log/haproxy.log /var/log/haproxy.log.bak
fi
if [ -e /var/log/haproxy.log.bak ]; then
    logrotate -f /etc/logrotate.conf
    chown nobody:nobody /var/log/haproxy.log
    chmod +x /var/log/haproxy.log
fi
sleep 1
if [ -e /var/log/haproxy.log ]; then
    rm -rf /var/log/haproxy.log.bak
fi
Note: the script must be run as root.
# crontab -e
59 23 * * * su - root -c '/root/system/cut_log.sh'
# vi /etc/init.d/haproxy
#!/bin/sh
# chkconfig: 345 85 15
# description: HAProxy is a TCP/HTTP reverse proxy which is particularly suited for high availability environments.

# Source function library.
if [ -f /etc/init.d/functions ]; then
    . /etc/init.d/functions
elif [ -f /etc/rc.d/init.d/functions ] ; then
    . /etc/rc.d/init.d/functions
else
    exit 0
fi

# Source networking configuration.
. /etc/sysconfig/network

# Check that networking is up.
[ "${NETWORKING}" = "no" ] && exit 0

[ -f /usr/local/haproxy/conf/haproxy.cfg ] || exit 1

RETVAL=0

start() {
    /usr/local/haproxy/sbin/haproxy -c -q -f /usr/local/haproxy/conf/haproxy.cfg
    if [ $? -ne 0 ]; then
        echo "Errors found in configuration file."
        return 1
    fi
    echo -n "Starting HAproxy: "
    daemon /usr/local/haproxy/sbin/haproxy -D -f /usr/local/haproxy/conf/haproxy.cfg -p /var/run/haproxy.pid
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && touch /var/lock/subsys/haproxy
    return $RETVAL
}

stop() {
    echo -n "Shutting down HAproxy: "
    killproc haproxy -USR1
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/haproxy
    [ $RETVAL -eq 0 ] && rm -f /var/run/haproxy.pid
    return $RETVAL
}

restart() {
    /usr/local/haproxy/sbin/haproxy -c -q -f /usr/local/haproxy/conf/haproxy.cfg
    if [ $? -ne 0 ]; then
        echo "Errors found in configuration file, check it with 'haproxy check'."
        return 1
    fi
    stop
    start
}

check() {
    /usr/local/haproxy/sbin/haproxy -c -q -V -f /usr/local/haproxy/conf/haproxy.cfg
}

rhstatus() {
    status haproxy
}

condrestart() {
    [ -e /var/lock/subsys/haproxy ] && restart || :
}

# See how we were called.
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        restart
        ;;
    reload)
        restart
        ;;
    condrestart)
        condrestart
        ;;
    status)
        rhstatus
        ;;
    check)
        check
        ;;
    *)
        echo $"Usage: haproxy {start|stop|restart|reload|condrestart|status|check}"
        RETVAL=1
esac

exit $RETVAL
chmod +x /etc/init.d/haproxy
chkconfig --add haproxy
chkconfig haproxy on
service haproxy start
http://192.168.233.90:91/admin
http://192.168.233.83:91/admin
http://192.168.233.84:91/admin
Because no application is running yet, the proxy returns a 503 error.
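The same server state shown on the statistics page can be read from the command line through the stats socket declared in haproxy.cfg, using the socat binary built earlier. The sketch below assumes the socket path /usr/local/haproxy/HaproxSocket from the configuration above and the CSV layout of HAProxy's "show stat" output, in which the status is the 18th column. server_status is a hypothetical helper name.

```shell
#!/bin/sh
# server_status
# Hypothetical helper: reads HAProxy "show stat" CSV on stdin and prints one
# "proxy/server STATUS" line per real server, skipping the header line and
# the FRONTEND/BACKEND summary rows.
server_status() {
    awk -F, '!/^#/ && NF >= 18 && $2 != "FRONTEND" && $2 != "BACKEND" {
        print $1 "/" $2, $18
    }'
}

# Example (run as root on either node):
#   echo "show stat" | socat stdio unix-connect:/usr/local/haproxy/HaproxSocket | server_status
```

While no application is listening on port 8000, both web_proxy servers should be reported as DOWN, which is the condition behind the 503 response.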
Configure session replication in the application.
# vi /cluster/zhzxxt/deploy/app.war/WEB-INF/web.xml
Add a <distributable/> line directly under <web-app>:
<!DOCTYPE web-app PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN" "http://java.sun.com/dtd/web-app_2_3.dtd">
<web-app>
    <distributable/>
# vi /cluster/jboss4/server/node01/deploy/jboss-web-cluster.sar/META-INF/jboss-service.xml
# vi /cluster/jboss4/server/node02/deploy/jboss-web-cluster.sar/META-INF/jboss-service.xml
<attribute name="ClusterName">Tomcat-APP-Cluster</attribute>
<config>
    <!-- bind_addr below is node01's address; on node02 it is presumably 192.168.233.84 -->
    <TCP bind_addr="192.168.233.83" start_port="7810" loopback="true"
         tcp_nodelay="true"
         recv_buf_size="20000000"
         send_buf_size="640000"
         discard_incompatible_packets="true"
         enable_bundling="true"
         max_bundle_size="64000"
         max_bundle_timeout="30"
         use_incoming_packet_handler="true"
         use_outgoing_packet_handler="false"
         down_thread="false" up_thread="false"
         use_send_queues="false"
         sock_conn_timeout="300"
         skip_suspected_members="true"/>
    <TCPPING initial_hosts="192.168.233.83[7810],192.168.233.84[7810]"
         port_range="3"
         timeout="3000"
         down_thread="true" up_thread="true"
         num_initial_members="3"/>
    <MERGE2 max_interval="100000"
         down_thread="true" up_thread="true" min_interval="20000"/>
    <FD_SOCK down_thread="true" up_thread="true"/>
    <FD timeout="10000" max_tries="5" down_thread="true" up_thread="true" shun="true"/>
    <VERIFY_SUSPECT timeout="1500" down_thread="true" up_thread="true"/>
    <pbcast.NAKACK max_xmit_size="60000"
         use_mcast_xmit="false" gc_lag="0"
         retransmit_timeout="300,600,1200,2400,4800"
         down_thread="true" up_thread="true"
         discard_delivered_msgs="true"/>
    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
         down_thread="false" up_thread="false"/>
    <pbcast.GMS print_local_addr="true" join_timeout="3000"
         down_thread="true" up_thread="true"
         join_retry_timeout="2000" shun="true"
         view_bundling="true"/>
    <FC max_credits="2000000" down_thread="true" up_thread="true"
         min_threshold="0.10"/>
    <FRAG2 frag_size="60000" down_thread="true" up_thread="true"/>
    <pbcast.STATE_TRANSFER down_thread="true" up_thread="true" use_flush="false"/>
</config>
This completes the configuration of the whole architecture; it proved stable and reliable in testing.