Oracle RAC: Removing and Adding a Node, Tested for Real

======================================= BEGIN: Remove Node ====================================
Note: I did not actually run steps 8-10 in my environment -- I simply reformatted the OS and then re-added the node. Steps 8-10 are copied from other people's write-ups found online ^_^
--1. Change the service's PREFERRED instance to the surviving node (rac2)
/u01/app/11.2.0/grid/bin/srvctl modify  service -d rac -n -i rac2 -s websrv
                         
--2. Verify the change took effect
/u01/app/11.2.0/grid/bin/srvctl config service -d rac -s websrv
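In 11.2, `srvctl config service` prints lines such as `Preferred instances:` and `Available instances:`. A minimal sketch of checking the preferred instance from that output -- the sample text below is illustrative, not captured from this system:

```shell
# Illustrative sample of 11.2 "srvctl config service -d rac -s websrv" output
config_output='Service name: websrv
Service is enabled
Preferred instances: rac2
Available instances: rac1'

# Extract the preferred-instance list from the "Preferred instances:" line
preferred=$(printf '%s\n' "$config_output" | awk -F': ' '/^Preferred instances/ {print $2}')
echo "preferred=$preferred"
```

In practice you would pipe the real `srvctl config service` output into the same `awk` filter instead of using the canned string.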

--3. On the surviving node, execute:
alter database disable thread 1;

--4. On the surviving node, stop the listener and VIP of the node being removed
srvctl status listener
srvctl disable listener -n host2
srvctl stop listener -n host2
/u01/app/11.2.0/grid/bin/srvctl remove vip -i host1 -f

--5. Stop the instance on the node being removed, remove it from the configuration, then verify the removal
/u01/app/11.2.0/grid/bin/srvctl stop instance -d rac -i rac1 -f  
/u01/app/11.2.0/grid/bin/srvctl remove instance -d rac -i rac1
/u01/app/11.2.0/grid/bin/srvctl config database -d rac 

--6. As the grid user, update the inventory node list
/u01/app/11.2.0/grid/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/11.2.0/grid CLUSTER_NODES=host2 CRS=TRUE -silent -local

--7. Run the same step as the oracle user for the database home:
/u01/app/oracle/product/11.2.0/dbhome_1/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1/ CLUSTER_NODES=host2 -silent -local

--8. On the surviving node, delete the instance
dbca -silent -deleteInstance -nodeList host2 -gdbName rac -instanceName rac2 -sysDBAUserName sys -sysDBAPassword oracle

--9. On the node being removed, deinstall the database software and remove the Oracle home
$ORACLE_HOME/deinstall/deinstall -local

--10. Remove the node at the Grid Infrastructure level (delete the grid home); first confirm the node's state is Unpinned
su - grid
$ORACLE_HOME/olsnodes -s -t
---- If it shows Pinned, unpin it:
crsctl unpin css -n host2-priv
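`olsnodes -s -t` prints one line per node with its state and pinned flag. A sketch of spotting nodes that still need `crsctl unpin css -n <node>` -- the sample output is illustrative:

```shell
# Illustrative "olsnodes -s -t" output: node name, state, pinned flag
olsnodes_output='host1 Active Unpinned
host2 Active Pinned'

# List nodes whose third field is "Pinned" -- these must be unpinned
# before the Grid Infrastructure node removal can proceed
pinned_nodes=$(printf '%s\n' "$olsnodes_output" | awk '$3 == "Pinned" {print $1}')
echo "pinned: $pinned_nodes"
```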

--11. Disable the clusterware applications and daemons on the node being removed (run there):
su - root
cd /u01/app/11.2.0/grid/crs/install
./rootcrs.pl -deconfig -force
======================================= END: Remove Node =======================================

 

======================================= BEGIN: Add Node =======================================
--1. Add /etc/hosts entries identical to the primary node's
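For reference, a typical RAC /etc/hosts layout looks like the following -- every IP and host name here is an illustrative placeholder; the entries must match the primary node's file exactly:

```
# Public
192.168.1.11   host1
192.168.1.12   host2
# Private interconnect
10.0.0.11      host1-priv
10.0.0.12      host2-priv
# Virtual IPs
192.168.1.21   host1-vip
192.168.1.22   host2-vip
# SCAN (a single address shown here; DNS round-robin is preferred)
192.168.1.31   rac-scan
```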

--2. This step can be skipped
Add acfsload to autostart under /etc/init.d/. If chkconfig reports "service acfsload does not support chkconfig", add the three commented header lines shown below:
cat /etc/init.d/acfsload
#!/bin/sh
#chkconfig: 2345 10 90 
#description: acfsload
/u01/app/11.2.0/grid/bin/acfsload start -s

chkconfig --add acfsload

--3. Add the node
Set up ssh user equivalence:
ssh rac1 date
ssh rac2 date
ssh-keygen -t rsa
ssh-keygen -t dsa
ssh host2 cat ~/.ssh/id_rsa.pub >> /home/grid/.ssh/authorized_keys
ssh host1 cat ~/.ssh/id_rsa.pub >> /home/grid/.ssh/authorized_keys
ssh host2 cat ~/.ssh/id_dsa.pub >> /home/grid/.ssh/authorized_keys
ssh host1 cat ~/.ssh/id_dsa.pub >> /home/grid/.ssh/authorized_keys
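Before running cluvfy it is worth confirming that equivalence works non-interactively. A small helper sketch, reusing the host names above; `BatchMode=yes` makes ssh fail instead of prompting for a password when equivalence is not set up:

```shell
# Check passwordless ssh to each cluster node
check_equiv() {
    for h in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" date >/dev/null 2>&1; then
            echo "$h: OK"
        else
            echo "$h: FAILED"
        fi
    done
}

check_equiv host1 host2
```

Run it as both the grid and the oracle user, since cluvfy checks equivalence for each software owner separately.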

--4. 11.2 GI introduced the octssd time-synchronization service; when relying on octssd, disable the ntpd service:
service ntpd stop
chkconfig ntpd off
mv /etc/ntp.conf /etc/ntp.conf.bak
rm /var/run/ntpd.pid

----4.1 Add the new node's VIP and start it
/u01/app/11.2.0/grid/bin/srvctl add vip -n host1 -A host1-vip/255.255.255.0/eth0 -k 1
/u01/app/11.2.0/grid/bin/crsctl start resource ora.host1.vip
/u01/app/11.2.0/grid/bin/srvctl enable listener -n host1
/u01/app/11.2.0/grid/bin/srvctl start listener -n host1
/u01/app/11.2.0/grid/bin/srvctl status listener -n host1

--5. Check where the primary node's ASM disks live and verify they are visible on the node being added (raw devices are mounted here)
1). Copy the existing node's /etc/udev/rules.d/60-raw.rules file to the node being added
2). Run start_udev
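For reference, a 60-raw.rules file for this kind of setup typically binds the shared partitions to raw devices and sets ownership so ASM can open them. The device names and group below are illustrative -- the file on the new node must match the existing nodes exactly, which is why copying it is the right fix:

```
# Bind shared partitions to raw devices (illustrative device names)
ACTION=="add", KERNEL=="sdb1", RUN+="/bin/raw /dev/raw/raw1 %N"
ACTION=="add", KERNEL=="sdc1", RUN+="/bin/raw /dev/raw/raw2 %N"
ACTION=="add", KERNEL=="sdd1", RUN+="/bin/raw /dev/raw/raw3 %N"
# Ownership/permissions so the grid user (ASM) can access the raw devices
ACTION=="add", KERNEL=="raw[1-3]", OWNER="grid", GROUP="asmadmin", MODE="0660"
```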

--6. Verify the new node meets cluster requirements; run on node one or node two:
cluvfy stage -pre nodeadd -n host1 -fixup

--7. As the grid user, add the node:
$ORACLE_HOME/oui/bin/addNode.sh "CLUSTER_NEW_NODES={host1}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={host1-vip}" "CLUSTER_NEW_PRIVATE_NODE_NAMES={host1-priv}"
/u01/app/11.2.0/grid/root.sh

--8. On node one, execute:
cluvfy stage -pre nodeadd -n host2 -fixup

--9. On node one, switch to the oracle user and add the node:
$ORACLE_HOME/oui/bin/addNode.sh "CLUSTER_NEW_NODES={host1}"
$ORACLE_HOME/bin/dbca -silent -addInstance -nodeList host1 -gdbName rac -instanceName rac1 -sysDBAUserName sys -sysDBAPassword oracle
/u01/app/oracle/product/11.2.0/dbhome_1/root.sh

--10. Add the instance to the cluster configuration
/u01/app/11.2.0/grid/bin/srvctl add instance -d rac -i rac1 -n host1

--11. On an existing node, set local_listener and enable the redo thread
alter system set local_listener='(ADDRESS=(PROTOCOL=TCP)(HOST=10.163.84.18)(PORT=1521))' scope=both sid='rac1';
alter database enable thread 1;

--12. Add the node the service can fail over to
/u01/app/11.2.0/grid/bin/srvctl modify  service -d rac -n -i rac1 -a rac2 -s websrv

--13. Stop one instance and verify the service fails over to the other node
/u01/app/11.2.0/grid/bin/srvctl stop  instance -d rac -i rac2 -f
/u01/app/11.2.0/grid/bin/crsctl stat res -t
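`srvctl status service` reports which instance(s) the service is currently running on, which is easier to check than the full `crsctl stat res -t` table. A sketch of verifying the failover result -- the sample output line is illustrative:

```shell
# Illustrative output of: srvctl status service -d rac -s websrv
status_output='Service websrv is running on instance(s) rac1'

# After stopping rac2, the service should be running on rac1 only
case "$status_output" in
    *rac1*) echo "failover OK" ;;
    *)      echo "failover FAILED" ;;
esac
```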

======================================= END: Add Node =======================================

 

==================================== BEGIN: Issues Encountered ==================================

--Issue: PRVF-4007 : User equivalence check failed for user "grid"
Fix:
ssh rac1 date
ssh rac2 date

--Issue: Unable to obtain VIP information from node "host1"
Fix: on an existing node, start the VIP resource
Note: the first two (commented-out) commands are not needed here; they date from an earlier incident where a colleague deleted the virtual NIC at the OS level and the resources had to be started manually
--/u01/app/11.2.0/grid/bin/crsctl start res ora.host2.vip
--/u01/app/11.2.0/grid/bin/crsctl start res ora.scan1.vip
/u01/app/11.2.0/grid/bin/crsctl start res ora.host1.vip

--Issue: Disk Group CRS creation failed with the following message:
ORA-15018: diskgroup cannot be created
ORA-15031: disk specification '/dev/raw/raw3' matches no disks
ORA-15031: disk specification '/dev/raw/raw2' matches no disks
ORA-15031: disk specification '/dev/raw/raw1' matches no disks
Fix:
1. Copy the existing node's /etc/udev/rules.d/60-raw.rules file to the node being added
2. Run start_udev
3. Re-run the root script /u01/app/11.2.0/grid/root.sh
cd /u01/app/11.2.0/grid/crs/install/
./rootcrs.pl -deconfig -force
/u01/app/11.2.0/grid/bin/srvctl enable diskgroup -g ARCH -n host1
/u01/app/11.2.0/grid/bin/srvctl status diskgroup -g ARCH -n host1

--Issue: ERROR at line 1:
ORA-01565: error in identifying file '+DATA/rac/spfilerac.ora'
ORA-17503: ksfdopn:2 Failed to open file +DATA/rac/spfilerac.ora
ORA-15001: diskgroup "DATA" does not exist or is not mounted
ORA-15040: diskgroup is incomplete
ORA-15040: diskgroup is incomplete

Fix: the oracle binary should have permissions 6751. To correct the permissions, as the owner of the oracle binary:

1. cd $GRID_HOME/bin
   chmod 6751 oracle
2. cd $ORACLE_HOME/bin
   chmod 6751 oracle
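Mode 6751 is setuid + setgid on top of rwxr-x--x, which is what lets local connections attach to the instance as the software owner. A quick self-contained illustration of what `chmod 6751` actually sets, run against a throwaway file rather than the real binary:

```shell
# Demonstrate mode 6751 on a scratch file
tmpf=$(mktemp)
chmod 6751 "$tmpf"
# %a prints the octal mode, including the setuid (4) and setgid (2) bits
mode=$(stat -c %a "$tmpf")
echo "mode=$mode"
rm -f "$tmpf"
```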

==================================== END: Issues Encountered ==================================
