OS、HDD、Package
* CentOS 5.4 32 bit (Linux 2.6.18-164.el5) * 2台
o 每台 HDD *2�w
+ /dev/hda 20GB (System)
+ /dev/hdb 10GB (Database Files)
o Install Package
+ drbd83-8.3.8-1.el5.centos
+ kmod-drbd83-8.3.8-1.el5.centos
+ heartbeat-pils-2.1.3-3.el5.centos
+ heartbeat-ldirectord-2.1.3-3.el5.centos
+ heartbeat-2.1.3-3.el5.centos
+ heartbeat-stonith-2.1.3-3.el5.centos
+ mysql-server-5.0.77-4.el5_5.3
Networking
* Hostname: node1.weithenn.org (Default Primary Node)
o Host IP (eth0): 10.10.25.121/24
o Heartbeat IP (eth1): 192.168.1.1/24
* Hostname: node2.weithenn.org (Default Secondary Node)
o Host IP (eth0): 10.10.25.122/24
o Heartbeat IP (eth1): 192.168.1.2/24
Cluster
* Cluster IP (eth0:0): 10.10.25.120/24 (哪台主�C接手�� Primary Node 後��自�釉O定此�W卡及 IP Address)
* Cluster Reousrce Name: ha
安�b及�O定
Node1 及 Node2 共同�O定
步�E1.�P�] IPTables 及 SELinux
安�b�r�� CentOS 安�b於 hda 而 hdb 先不用理它後�m步�E���M行�理,我��要�楦呖捎眯原h境���浠��A的�O定,�先�二台主�C的防火�� (Iptables) 及 SELinux �P�]以利建置�^程能�利�绦小�(Node1 及 Node2 都必��O定)
#service iptables stop
#sestatus
SELinux status: disabled
步�E2.�O定�W路�Y�
�O定好二台主�C的 IP Address、Netmask、Default Gateway、DNS 及最重要的 Hostnames,�O定完成後�Y�如下(Node1 及 Node2 都必��O定)
* node1.weithenn.org (Default Primary Node)
o Host IP (eth0): 10.10.25.121/24
o Heartbeat IP (eth1): 192.168.1.1/24
* node2.weithenn.org (Default Secondary Node)
o Host IP (eth0): 10.10.25.122/24
o Heartbeat IP (eth1): 192.168.1.2/24
因�楦呖捎眯��於主�C的 FQDN 的�O定非常要求,��⒅�C名�Q加入 /etc/hosts �纫员� DNS �\作出�F���}�r也能保持主�C�\作正常,此�n案在 node1 及 node2 都必��O定,�O定完成後�交互�y�能否使用 FQDN 去 ping 到�Ψ街�C。(Node1 及 Node2 都必��O定)
#cat /etc/hosts
127.0.0.1 localhost
::1 localhost6.localdomain6 localhost6
10.10.25.121 node1.weithenn.org node1
10.10.25.122 node2.weithenn.org node2
步�E3.建立分割�^於 hdb
�� hdb 硬碟建立分割�^,此�w硬碟即���r二台主�C要同步�Y料的硬碟,也是��r我��存放 MySQL �Y料��n案的地方。(Node1 及 Node2 都必�建立)
#fdisk /dev/hdb //���� hdb 建立分割�^
The number of cylinders for this disk is set to 20805.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): n //�I入 n 表示要建立分割�^
Command action
e extended
p primary partition (1-4)
p //�I入 p 表示建立主要分割�^
Partition number (1-4): 1 //�I入 1 �榇酥饕�分割�^代�
First cylinder (1-20805, default 1): //�_始磁柱值,按下 enter 即可
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-20805, default 20805): //�Y束磁柱值,按下 enter 即可
Using default value 20805
Command (m for help): w //�I入 w 表示�_定�绦��才�O定
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
[root@node1 yum.repos.d]# partprobe //使��才的 partition table �更生效
建立分割�^完成後使用指令 fdisk -l �_定 partition talbe ��B
#fdisk -l
Disk /dev/hda: 21.4 GB, 21474754560 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/hda1 * 1 13 104391 83 Linux
/dev/hda2 14 2610 20860402+ 8e Linux LVM
Disk /dev/hdb: 10.7 GB, 10737377280 bytes
16 heads, 63 sectors/track, 20805 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes
Device Boot Start End Blocks Id System
/dev/hdb1 1 20805 10485688+ 83 Linux
建立分割�^完成後�建立 /db �Y料�A,此�Y料�A也就是��r hdb 空�g要�燧d於此�Y料�A。 (Node1 及 Node2 都必�建立)
#mkdir /db
步�E4.安�b相�P套件
使用 [YUM] 安�b DRBD、Heartbeat、MySQL 等相�P套件。(Node1 及 Node2 都必�安�b)
#yum -y install drbd83 kmod-drbd83
#yum -y install heartbeat
#yum -y install heartbeat heartbeat-ldirectord heartbeat-pils heartbeat-stonith
#yum -y install mysql-server
安�b好後修改 mysql �O定�n,�� mysql �Y料�齑娣怕�接深A�O的 /var/lib/mysql 修改至��r���绦杏驳��Y料同步的 /db。(Node1 及 Node2 都必��O定)
#vi /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql //�A�O值
datadir=/db //修改後
socket=/var/lib/mysql/mysql.sock
user=mysql
...略...
步�E5.修改 drbd �O定�n
修改 drbd �O定�n,�O定�n�热葜腥裟�的 CentOS �� 32 bit �t安�b heartbeat 後路��� /usr/lib/heartbeat,若 CentOS �� 64 bit �t heartbeat 安�b後路��� /usr/lib64/heartbeat。(Node1 及 Node2 都必��O定)
#vi /etc/drbd.conf
global {
minor-count 64;
usage-count yes;
}
common {
syncer { rate 1000M; }
}
resource ha {
protocol C;
handlers {
pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
local-io-error "/usr/lib/drbd/notify-local-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
fence-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
pri-lost "/usr/lib/drbd/notify-pri-lost.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
split-brain "/usr/lib/drbd/notify-split-brain.sh root";
out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
}
startup {
wfc-timeout 60;
degr-wfc-timeout 120;
outdated-wfc-timeout 2;
}
disk {
on-io-error detach;
fencing resource-only;
}
syncer {
rate 1000M;
}
on node1.weithenn.org {
device /dev/drbd0;
disk /dev/hdb1;
address 192.168.1.1:7788;
meta-disk internal;
}
on node2.weithenn.org {
device /dev/drbd0;
disk /dev/hdb1;
address 192.168.1.2:7788;
meta-disk internal;
}
}
完成後��{整 drbd �绦�n�嘞抟员汜崂m指令能�利�绦校�若此步�E省略的�後�m�绦羞^程��出�F�e�`。(Node1 及 Node2 都必��O定)
#chgrp haclient /sbin/drbdsetup
#chmod o-x /sbin/drbdsetup
#chmod u+s /sbin/drbdsetup
#chgrp haclient /sbin/drbdmeta
#chmod o-x /sbin/drbdmeta
#chmod u+s /sbin/drbdmeta
步�E6.�d入 drbd 模�M�K建立 resource
使用指令 modprobe 指令�磔d入 drbd 模�M,�d入完成後使用指令 drbdadm create-md �斫�立 drbd resource,由於我��在 drbd �O定�n中指令 resource 名�Q�� ha 所以指令便�入 ha。(Node1 及 Node2 都必��O定)
#modprobe drbd //�d入 drbd 模�M
#lsmod|grep drbd //�_�J drbd 模�M是否�d入
drbd 228528 0
#dd if=/dev/zero of=/dev/hdb1 bs=1M count=100 //把一些�Y料塞到 hdb �� (否�t create-md �r有可能��出�F�e�`)
#drbdadm create-md ha //建立 drbd resource
#service drbd start //��� drbd 服��
#chkconfig drbd on //�O定 drbd �_�C�r自����
��油瓿舍峥梢允褂� service drbd status 指令�聿榭茨壳� drbd 的��B,�� node1 ��� drbd 服�斩� node2 尚未���r���l�F��B�� Secondary/Unknown,�� node2 ��� drbd 服�蔗�t���l�F��B�� Secondary/Secondary,而 ds ��B�� Inconsistent 表示二台主�C�Y料尚未同步。
#service drbd status
drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by
[email protected], 2010-06-04 08:04:16
m:res cs ro ds p mounted fstype
0:ha Connected Secondary/Secondary Inconsistent/Inconsistent C
�H Node1 �O定
步�E1.初始化二台主�C hdb �Y料
�_定 Node1 及 Node2 主�C都可��y到�Ψ� (Secondary/Secondary) 後我���O定 Node1 �� Primary Node,�K使二台主�C�_始同步 hdb 硬碟�Y料 (��r的 /dev/drbd0),此�r查看 drbd ��B可�l�F同步的百分比及�M度。 (�H Node1 主�C�绦�)
#drbdadm -- --overwrite-data-of-peer primary ha
#service drbd status
drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by
[email protected], 2010-06-04 08:04:16
m:res cs ro ds p mounted fstype
... sync'ed: 5.7% (9660/10236)M delay_probe:
0:ha SyncSource Primary/Secondary UpToDate/Inconsistent C
��同步完成後可�l�F二台主�C的 ds ��B都�� UpToDate 表示同步完成,二台主�C�碛邢嗤�且最新的�Y料。
在 Node1 主�C�绦锌吹较铝��B (此台�� Primary Node)
#service drbd status
drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by
[email protected], 2010-06-04 08:04:16
m:res cs ro ds p mounted fstype
0:ha Connected Primary/Secondary UpToDate/UpToDate C
在 Node2 主�C�绦锌吹较铝��B (此台�� Secondary Node)
#service drbd status
drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by
[email protected], 2010-06-04 08:04:16
m:res cs ro ds p mounted fstype
0:ha Connected Secondary/Primary UpToDate/UpToDate C
步�E2.初始化 MySQL �Y料��
Node1 及 Node2 主�C都完成�Y料同步作�I後 (即 /dev/drbd0 完成初始化) 首先使用指令 mkfs.ext3 指令�砀袷交� /dev/drbd0,格式化完成後�� /dev/drbd0 �燧d至 /db (�H Node1 主�C�绦�)
#mkfs.ext3 /dev/drbd0
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
1310720 inodes, 2621333 blocks
131066 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2684354560
80 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 30 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
#mount /dev/drbd0 /db
#df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
19G 2.7G 15G 16% /
/dev/hda1 99M 12M 82M 13% /boot
tmpfs 252M 0 252M 0% /dev/shm
/dev/drbd0 9.9G 151M 9.2G 2% /db
�� /dev/drbd0 �利�燧d到 /db 後��� MySQL 服��泶_定�Y料�煜嚓P�n案是否能�利��入 /db �Y料�A�取�(�H Node1 主�C�绦�)
#service mysqld start
#ll /db
total 20544
-rw-rw---- 1 mysql mysql 10485760 Nov 1 14:48 ibdata1
-rw-rw---- 1 mysql mysql 5242880 Nov 1 14:48 ib_logfile0
-rw-rw---- 1 mysql mysql 5242880 Nov 1 14:48 ib_logfile1
drwx------ 2 mysql mysql 16384 Nov 1 14:46 lost+found
drwx------ 2 mysql mysql 4096 Nov 1 14:48 mysql //�A�O�Y料��
drwx------ 2 mysql mysql 4096 Nov 1 14:48 test //�y��Y料��
步�E3.停止 MySQL 服�占靶遁d
�_定 MySQL 服�湛身�利��入 /db 後,便可��湓O定 Heartbeat 部份但在�O定以前�先�� Node1 主�C中 MySQL 服�胀V辜� /dev/drbd0 卸�d�K把 Node1 先退回�� Secondary Node,以避免�O定 Heartbeat �^程中受到影� (�H Node1 主�C�绦�)
#service mysqld stop //停止 mysql 服��
#umount /dev/drbd0 //卸�d /dev/drbd0
#drbdadm secondary ha //在 node1 主�C�绦�⑵浣�� Secondary Node
#service drbd status //�_�J二台主�C都�� Secondary Node
drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by
[email protected], 2010-06-04 08:04:16
m:res cs ro ds p mounted fstype
0:ha Connected Secondary/Secondary UpToDate/UpToDate C
Node1 及 Node2 共同�O定
步�E1.�O定 Heartbeat �O定�n - ha.cf
�O定 ha.cf �O定�n其中 Node1 及 Node2 在�O定�n�热莶煌���H ucast 所指定的 IP Address, Node1 必�指定 eth0 �� Node2 的 10.10.25.122 而 eth1 �� 192.168.1.2,而 Node2 �t指定 eth0 �� Node1 的 10.10.25.121 而 eth1 �� 192.168.1.1。(Node1 及 Node2 都必��O定)
#vi /etc/ha.d/ha.cf
debugfile /var/log/ha-debug //heartbeat 除�e���n
logfile /var/log/ha-log //heartbeat ���n
logfacility local0 //���n的��等�
autojoin none
ucast eth0 10.10.25.122 //此�� Node1 主�C的�O定,若是 Node2 主�C��O定�� 10.10.25.121
ucast eth1 192.168.1.2 //此�� Node1 主�C的�O定,若是 Node2 主�C��O定�� 192.168.1.1
ping 10.10.25.254 //IP �B���y用,�O定�^�W�鹊� Gateway (���W路或 Heartbeat 失效�r�y�用)
respawn hacluster /usr/lib/heartbeat/ipfail
respawn hacluster /usr/lib/heartbeat/dopd
apiauth dopd gid=haclient uid=hacluster
udpport 694 //使用 UDP Protocol 及 694 Port �硗ㄓ�
warntime 5 //�W路�l生��r�r告警�r�g 5 秒
deadtime 15 //�W路�嗑���r�l生 15 秒�t判定�W路出���} (Secondary Node ��浣邮� Primary Node 的服��)
initdead 60
keepalive 2 //每 2 秒 Node 互相��y一次 (使用 ping)
node node1.weithenn.org
node node2.weithenn.org
auto_failback off //�� Primary Node �l生���} Secondary Node 接手後若本�淼� Primary Node 修�歪崾欠褚�把目前的 Primary Node ��回
步�E2.�O定 Heartbeat �O定�n - haresources
�O定 haresource �O定�n Node1 及 Node2 �O定�n�热菀荒R�樱�表示�A�O使用 Node1 主�C��任 Primary Node 角色,此�O定�n�热菘煞�槲宥�砜础�(Node1 及 Node2 都必��O定)
1. 指定 Primary Node 的 FQDN 此例�� node1.weithenn.org
2. 指定 Cluster IP Address 此例�� 10.10.25.120
3. 指定 Cluster Resource Name 此例�� drbddisk::ha
4. 指定 Cluster Device、Mount Point、File System Type 此例�� Filesystem::/dev/drbd0::/db::ext3
5. 指定 Service 此例�� mysql
此次��作的 haresources �O定�n�热萑缦�
#vi /etc/ha.d/haresources
node1.weithenn.org 10.10.25.120 drbddisk::ha Filesystem::/dev/drbd0::/db::ext3 mysql
步�E3.�O定 Heartbeat �O定�n - authkeys
此�O定�n�� Cluster Node 之�g的密�a,也就是主�C必�具�浯艘幻艽a�n才��被�J�槭峭�一�� Cluster 中的 Node,此次使用 sha1 ��a方式�K使用 urandom 指令�⒁欢�y���入�n案�犬�作 Cluster 密�a,此步�E在 Node1 主�C�绦型瓿舍嵴�利用 scp 指令��n案�}�u到 Node2 主�C中以便保持密�a�n案一致。
#( echo -ne "auth 1\n1 sha1 "; dd if=/dev/urandom bs=512 count=1 | openssl md5) > /etc/ha.d/authkeys
#cat /etc/ha.d/authkeys
auth 1
1 sha1 71461fc5e160d7846c2f4b524f952128
#chmod 600 /etc/ha.d/authkeys
#scp /etc/ha.d/authkeys node2:/etc/
步�E4.�O定 Heartbeat 服�赵O定�n - mysql
因�轭A�O的 Heartbeat 服�赵O定�n中�A�O�K�]有 mysql 因此我��只好自已���,��r Secondary Node 接手 Primary Node 服��r�K是依��此�n案�M行。(Node1 及 Node2 都必��O定)
#vi /etc/ha.d/resource.d/mysql //�热萑缦�
#!/bin/bash
. /etc/ha.d/shellfuncs
case "$1" in
start)
res=`/etc/init.d/mysqld start`
ret=$?
ha_log $res
exit $ret
;;
stop)
res=`/etc/init.d/mysqld stop`
ret=$?
ha_log $res
exit $ret
;;
status)
if [[ `ps -ef | grep '[m]ysqld'` > 1 ]]; then
echo "running"
else
echo "stopped"
fi
;;
*)
echo "Usage: mysqld {start|stop|status}"
exit 1
;;
esac
exit 0
#chmod 755 /etc/ha.d/resource.d/mysql //�⒋�n案�嘞拊O定�榭�绦�
步�E5.新增 Heartbeat 服��
�� Heartbeat 服�招略鲋料到y中,�注意此服�毡仨�比 DRBD �要慢��臃�t��因�� DRBD 服�丈形��� (/db 尚未�燧d) 而�a生�e�`。 (Node1 及 Node2 都必��O定)
#chkconfig --add heartbeat
#ll /etc/rc3.d/S7*
lrwxrwxrwx 1 root root 14 Nov 1 14:25 /etc/rc3.d/S70drbd -> ../init.d/drbd
lrwxrwxrwx 1 root root 19 Nov 1 15:01 /etc/rc3.d/S75heartbeat -> ../init.d/heartbeat
#chkconfig heartbeat on
步�E6.��� Heartbeat 服��
完成後在 Node1 及 Node2 主�C��� Hertbeat 服�� (Node1 及 Node2 都必��O定)
#service heartbeat start
��二台主�C都��� Heartbeat 服�蔗幔�因�� haresources 的�O定 Node1 主�C��成�� Cluster 中的 Primary Node 因此 Node1 主�C���碛邢铝匈Y源
* �燧d /dev/drbd0 至 /db (Node2 主�C此�r�o法�燧d /dev/drbd0 至 /db)
* ��� mysqld 服��
* �碛� Cluster IP 10.10.25.120 (eth0:0)
�y� HA �C制
在�y� HA �C制�r您可使用下列指令�砑�r (每 1 秒更新) �^察 DRBD 的 Primary/Secondary ��B
watch -n 1 service drbd status
在�y�切�Q Node 以前您可在 Primary Node 建立 database 待 Secondary Node 接手後查看��才建立的�Y料�焓欠褚渤晒��D移�^��
#mysqladmin -u root password 'weithenn' //�O定 MySQL 管理者密�a�� weithenn
#mysqladmin -p create testdb1 //建立名�Q�� testdb1 的�Y料�� (��要求�入 MySQL 管理者密�a)
#mysqlshow -p //�@示�Y料�� (��要求�入 MySQL 管理者密�a)
�y�1.手�忧�Q Node
要手�忧�Q Node 只要在 Primary Node 上�� heartbeat 服�胀V辜纯桑���原�淼� Primary Node 的 heartbeat 服�赵俅��俞�K不���⒅���回 (因�� ha.cf �O定�n中 auto_failback �� off),若您在 ha.cf �O定�n中 auto_failback �� on �t��原�淼� Primary Node (�F在�� Secondary Node) 的 Heartbeat 服�赵俅��俞���⒅���回。
Primary Node �� Node1 主�C,�绦� heartbeat 服�罩匦���
[root@node1 ~]#service heartbeat restart
Stopping High-Availability services:
[ OK ]
Waiting to allow resource takeover to complete:
[ OK ]
Starting High-Availability services:
2010/11/03_11:31:41 INFO: Resource is stopped
[ OK ]
��上述指令�绦�r您��看到 Node1 主�C中 DRBD ��B�化�� Primary/Secondary >> Secondary/Secondary >> Secondary/Primary,��上述指令�绦型瓿舍��B如下
[root@node1 ~]#service drbd status
drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by
[email protected], 2010-06-04 08:04:16
m:res cs ro ds p mounted fstype
0:ha Connected Secondary/Primary UpToDate/UpToDate C
此�r在 Node2 主�C您可看到接手 Primary Node 角色 (�燧d /dev/drbd0 至 /db、接手 mysqld 服�铡��� Cluster IP 10.10.25.120 及�W卡 eth0:0)
[root@node2 ~]#service drbd status //接手 /dev/drbd0
drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by
[email protected], 2010-06-04 08:04:16
m:res cs ro ds p mounted fstype
0:ha Connected Primary/Secondary UpToDate/UpToDate C /db ext3
[root@node2 ~]# service mysqld status //接手 mysqld 服��
mysqld (pid 10101) is running...
[root@node2 ~]# ifconfig eth0:0 //接手 Cluster IP
eth0:0 Link encap:Ethernet HWaddr 00:03:FF:1E:7B:82
inet addr:10.10.25.120 Bcast:10.10.25.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:11 Base address:0xc000
�y�2.��主�C�W路�嗑�
模�M���W路�嗑��r是否��自�忧�Q Node,您可�⒉逶� eth0 �W路�拔掉或在 Primary Node 上�绦� ifdown eth0 指令去�P�] eth0 �W卡�磉M行 Node 切�Q�y�。
1. 拔除 eth0 上的�W路�或�绦� ifdown eth0 指令�黻P�]�W卡�B�
2. Primary Node 切�Q�� Secondary Node,而原 Secondary Node 接手�� Primary Node
�y�3.���W路卡及交�Q器�p��
若您有多片�W卡�r可�O定�W卡 Bonding (NIC Teaming) 功能�K接在不同的���w�W路交�Q器上,如此一�砑纯深A防�W路卡�p�幕蚓W路交�Q器�p�亩�造成的�嗑����}。
正�_的�_�C/�P�C流程
�C器�\作日子一久�y免��遇到�C器�p�幕蚱渌�因素而需要停�C,��然�翁ㄍ�C�r我��可利用停止 Heartbeat 服��磉_成切�Q Node 的目的�M而�S修硬�w,若遇到特殊情�r需要�⒍�台主�C都停�C的情�r下正�_的�_�C/�P�C流程是必要的,以下便是�f明正�_的�_�C/�P�C流程。
正�_�_�C流程
��二台 Node 主�C都�P�C而�A�O Node1 �� Primary Node 的情�r下,正�_的�_�C流程�椋�
1. Power On Node1 主�C
2. �^�s 1 分�後再 Power On Node2 主�C
3. �� Node1 主�C�_�C流程�\作到 DRBD 程序�r��等待 60 秒 (等待 Secondary Node 回��),在�得肫陂g Node2 主�C�_�C流程也�\作到 DRBD 程序�r��去找 Node1 主�C�M行�贤�
4. Node1 �利�_�C完成,Node2 也�_�C完成
5. 登入 Node1、Node2 主�C查看 DRBD ��B及相�P�Y� (�燧d /dev/drbd0 至 /db、接手 mysqld 服�铡��� Cluster IP 10.10.25.120 及�W卡 eth0:0)
正�_�P�C流程
��遇到不可抗力因素必�要二台主�C都�P�C且 Node1 �� Primary Node 的情�r下,正�_的�P�C流程�椋�
1. 在 Node1 主�C�绦� service heartbeat stop 指令�� Heartbeat 服�贞P�]
2. �_�J Node2 主�C接手成�� Primary Node 且相�P服�者\作正常
3. �� Node1 主�C�P�C (此�r在 Node2 主�C看到的 DRBD ��B�� Primary/Unknown)
4. 在 Node2 主�C上�绦� service mysqld stop 指令�� MySQL 服�贞P�]
5. MySQL 服�贞P�]完成後即可�� Node2 主�C�P�C
6. 之後�_�C�r再遵照上述的 正�_�_�C流程 步�E即可�利� Cluster 再度�\作
移�C�O定流程
因�槌霈F不同的�O定需求 (加�b�W卡所以要�O定�W卡�� Bonding 模式、更�Q IP Address、伺服器要�Q地方),因此除了上述的正常�_�C/�P�C流程是不�虻模��便��一下整��流程:
1. 先�� Node1、Node2 �_�C��� drbd、heartbeat 服�贞P�]
2. 於 Node1 主�C�� heartbeat 服�胀V梗��_�J Node2 主�C接手 HA 服�蔗峒纯�� Node1 主�C�P�C
3. 於 Node2 主�C�� mysqld 服�胀V贯峒纯申P�C
4. �� Node1、Node2 退出�C��後搬移至新地�c,上架前先安�b�U充的�W卡此�r因�殚_�C�]有��� HA 服�账�以�]有�_�C�序的���}
5. �_�C完成後�O定 Node1、Node2 的�W卡 Bonding 模式及修改 IP Address �K�y�容�e�C制是否�\作
6. 修改 drbd、heartbeat 相�P�O定�n�热� (IP Address)
7. �� Node1、Node2 �_�C��� drbd、heartbeat 服��⒂冕嵯�� Node1 重新�_�C�s 1 分�後再�� Node2 重新�_�C
8. �_�C完成後�_定 HA �C制是否�\作�K�y� HA �C制
9. 完成改�O定及移�C要求
�⒖�
[DRBD - Software Development for High Availability Clusters]
[The DRBD User's Guide]
[The Linux-HA User's Guide]
Me FAQ
Q1.'ha' ignored, since this host (node2.weithenn.org) is not mentioned with an 'on' keyword.?
Error Meaage:
�绦兄噶� drbdadm create-md ha �r出�F如下�e�`�息
'ha' ignored, since this host (node2.weithenn.org) is not mentioned with an 'on' keyword.
Ans:
因�樵� drbd �O定�n drbd.conf 中 on 本���的是 node1、node2 而以,�⒃O定�n�热莞�� FQDN 名�Q node1.weithenn.org、node2.weithenn.org 後即可正常�绦兄噶睢�
Q2.drbdadm create-md ha: exited with code 20?
Error Meaage:
�绦兄噶� drbdadm create-md ha �r出�F如下�e�`�息
open(/dev/hdb1) failed: No such file or directory
Command 'drbdmeta 0 v08 /dev/hdb1 internal create-md' terminated with exit code 20
drbdadm create-md ha: exited with code 20
Ans:
因�橥�了�绦� fdisk /dev/hdb 指令建立分割�^所造成,如下�� /dev/hdb 建立分割�^後指令即可正常�绦�
#fdisk /dev/hdb //���� hdb 建立分割�^
The number of cylinders for this disk is set to 20805.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): n //�I入 n 表示要建立分割�^
Command action
e extended
p primary partition (1-4)
p //�I入 p 表示建立主要分割�^
Partition number (1-4): 1 //�I入 1 �榇酥饕�分割�^代�
First cylinder (1-20805, default 1): //�_始磁柱值,按下 enter 即可
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-20805, default 20805): //�Y束磁柱值,按下 enter 即可
Using default value 20805
Command (m for help): w //�I入 w 表示�_定�绦��才�O定
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
[root@node1 yum.repos.d]# partprobe //使��才的 partition table �更生效
Q3.drbdadm create-md ha: exited with code 40?
Error Meaage:
�绦兄噶� drbdadm create-md ha �r出�F如下�e�`�息
Device size would be truncated, which
would corrupt data and result in
'access beyond end of device' errors.
You need to either
* use external meta data (recommended)
* shrink that filesystem first
* zero out the device (destroy the filesystem)
Operation refused.
Command 'drbdmeta 0 v08 /dev/hdb1 internal create-md' terminated with exit code 40
drbdadm create-md ha: exited with code 40
Ans:
使用 dd 指令�⒁恍┵Y料塞到 /dev/hdb 後再�绦� drbdadm create-md ha 指令即可�利�绦�
#dd if=/dev/zero of=/dev/hdb1 bs=1M count=100
Q1.DRBD ��B始�K是 Secondary/Unknown?
Error Meaage:
Node1、Node2 主�C��� DRBD 後��B始�K是 Secondary/Unknown
#service drbd status
drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by
[email protected], 2010-06-04 08:04:16
m:res cs ro ds p mounted fstype
0:ha WFConnection Secondary/Unknown Inconsistent/DUnknown C