Building RAC in the Cloud

  • Overview
  • Installation environment
  • Problems encountered during installation
    • Network multicast
    • Shared disks have no scsi_id
    • Disk warning during GI installation
    • root script fails
  • Summary

Overview

This article compares building an Oracle RAC environment in the cloud with building it on a conventional architecture, and analyzes and resolves the problems encountered along the way. It is intended for learning and discussion only and cannot be applied to production use.

Installation environment

OS version: CentOS 7.2
Database version: 11.2.0.4
ECS1 IP: 172.102.2.150
ECS2 IP: 172.102.2.151
RAC network plan

IP              Name
10.10.10.101 rac1
10.10.10.102 rac2
10.10.10.103 rac1-vip
10.10.10.104 rac2-vip
192.168.100.101 rac1-priv
192.168.100.102 rac2-priv
10.10.10.105 scan-ip
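The plan above maps directly to name-resolution entries needed on both nodes. A minimal sketch, writing them to a scratch file first so they can be reviewed before being appended to /etc/hosts (the local file name hosts.rac is illustrative):

```shell
# Stage the RAC name-resolution entries in a scratch file
# (hosts.rac is an illustrative name; append to /etc/hosts on both nodes).
cat > hosts.rac <<'EOF'
10.10.10.101    rac1
10.10.10.102    rac2
10.10.10.103    rac1-vip
10.10.10.104    rac2-vip
192.168.100.101 rac1-priv
192.168.100.102 rac2-priv
10.10.10.105    scan-ip
EOF
wc -l < hosts.rac   # 7 entries
```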

Problems encountered during installation

Network multicast

Oracle RAC requires multicast on both the public and private networks, but the VPC network used by ECS does not support multicast. The third-party tool N2N can be used to build virtual networks between servers in the same VPC over which multicast does work.
Installation

wget https://github.com/ntop/n2n/archive/master.zip
unzip master.zip
cd n2n-master/
make
make PREFIX=/opt/n2n install

Start the supernode

Run on node 1:
nohup /opt/n2n/sbin/supernode -l 65530 &

Start the virtual interfaces

Node 1
/opt/n2n/sbin/edge -d edge0 -a 10.10.10.101 -s 255.255.255.0 -c dtstack -k dtstack -l 172.102.2.150:65530 -E -r
/opt/n2n/sbin/edge -d edge1 -a 192.168.100.101 -s 255.255.255.0 -c dtstack -k dtstack -l 172.102.2.150:65530 -E -r
Node 2
/opt/n2n/sbin/edge -d edge0 -a 10.10.10.102 -s 255.255.255.0 -c dtstack -k dtstack -l 172.102.2.150:65530 -E -r
/opt/n2n/sbin/edge -d edge1 -a 192.168.100.102 -s 255.255.255.0 -c dtstack -k dtstack -l 172.102.2.150:65530 -E -r

Check the network

#ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.102.2.150  netmask 255.255.255.0  broadcast 172.102.2.255
        ether 00:16:4f:02:03:2f  txqueuelen 1000  (Ethernet)
        RX packets 9972204  bytes 7370730206 (6.8 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 10323628  bytes 13053626825 (12.1 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
edge0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1400
        inet 10.10.10.101  netmask 255.255.255.0  broadcast 10.10.10.255
        ether b6:f8:b3:29:91:54  txqueuelen 1000  (Ethernet)
        RX packets 3976751  bytes 357822921 (341.2 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 7668656  bytes 10636528983 (9.9 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
edge1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1400
        inet 192.168.100.101  netmask 255.255.255.0  broadcast 192.168.100.255
        ether aa:26:8e:9f:a1:4f  txqueuelen 1000  (Ethernet)
        RX packets 2016215  bytes 1345879442 (1.2 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1587796  bytes 805075807 (767.7 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

To remove or reconfigure a virtual network, find and kill the corresponding edge processes.

[grid@rac1 ~]$ ps -ef|grep edg
root     10499     1  0 Jun19 ?        00:15:01 /opt/n2n/sbin/edge -d edge0 -a 10.10.10.101 -s 255.255.255.0 -c dtstack -k dtstack -l 172.102.2.150:65530 -E -r
root     10507     1  0 Jun19 ?        00:04:38 /opt/n2n/sbin/edge -d edge1 -a 192.168.100.101 -s 255.255.255.0 -c dtstack -k dtstack -l 172.102.2.150:65530 -E -r
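Cleanup is easier with a small helper that extracts the edge PIDs from `ps -ef` style output. The helper and the sample file name `ps.sample` are illustrative; it is exercised here against captured output rather than a live node:

```shell
# edge_pids: read `ps -ef` output on stdin and print the PID column of any
# n2n edge process ([s]bin keeps the pattern from matching itself in ps).
edge_pids() {
  awk '/[s]bin\/edge/ { print $2 }'
}

# Exercise against a captured sample (illustrative file name):
cat > ps.sample <<'EOF'
root 10499 1 0 Jun19 ? 00:15:01 /opt/n2n/sbin/edge -d edge0 -a 10.10.10.101
root 10507 1 0 Jun19 ? 00:04:38 /opt/n2n/sbin/edge -d edge1 -a 192.168.100.101
grid 11000 1 0 Jun19 ? 00:00:00 grep edge
EOF
edge_pids < ps.sample   # prints 10499 and 10507, skips the grep line
# On a live node: kill $(ps -ef | edge_pids)
```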

The later runcluvfy.sh network check produced the following log:

Checking subnet "172.102.2.0" for multicast communication with multicast group "230.0.1.0"...
PRVG-11134 : Interface "172.102.2.151" on node "rac2" is not able to communicate with interface "172.102.2.150" on node "rac1"
PRVG-11134 : Interface "172.102.2.150" on node "rac1" is not able to communicate with interface "172.102.2.151" on node "rac2"
Checking subnet "172.102.2.0" for multicast communication with multicast group "224.0.0.251"...
PRVG-11134 : Interface "172.102.2.151" on node "rac2" is not able to communicate with interface "172.102.2.150" on node "rac1"
PRVG-11134 : Interface "172.102.2.150" on node "rac1" is not able to communicate with interface "172.102.2.151" on node "rac2"
Checking subnet "10.10.10.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "10.10.10.0" for multicast communication with multicast group "230.0.1.0" passed.

Checking subnet "192.168.100.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.100.0" for multicast communication with multicast group "230.0.1.0" passed.

As the log shows, the 172.102.2 subnet still cannot do multicast, but the multicast communication checks on the virtual 10.10.10 and 192.168.100 subnets pass.
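When rerunning the check after each network change, a quick grep over the saved runcluvfy output summarizes the multicast results. The log file name cluvfy.log is illustrative; it is populated here with the lines shown above:

```shell
# Count multicast failures (PRVG-11134) and passing subnets in a saved
# runcluvfy log (cluvfy.log is an illustrative file name).
cat > cluvfy.log <<'EOF'
PRVG-11134 : Interface "172.102.2.151" on node "rac2" is not able to communicate with interface "172.102.2.150" on node "rac1"
PRVG-11134 : Interface "172.102.2.150" on node "rac1" is not able to communicate with interface "172.102.2.151" on node "rac2"
Check of subnet "10.10.10.0" for multicast communication with multicast group "230.0.1.0" passed.
Check of subnet "192.168.100.0" for multicast communication with multicast group "230.0.1.0" passed.
EOF
grep -c 'PRVG-11134' cluvfy.log   # failures: 2
grep -c 'passed\.$' cluvfy.log    # passing subnets: 2
```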

Shared disks have no scsi_id

The shared cloud disks do not expose a scsi_id, so the usual scsi_id-based udev rules for ASM cannot be used (screenshot omitted).
After checking with Alibaba engineers (screenshot omitted), the only option is to attach the disks directly as raw devices.
Partition the disks with fdisk (steps omitted), then edit the udev rules file:

#vi /etc/udev/rules.d/60-raw.rules
ACTION=="add", KERNEL=="vdb1",RUN+="/bin/raw /dev/raw/raw1 %N"
ACTION=="add", KERNEL=="vdc1",RUN+="/bin/raw /dev/raw/raw2 %N"
ACTION=="add", KERNEL=="raw[1-5]", OWNER="grid", GROUP="asmadmin", MODE="660"
#udevadm control --reload-rules

Since this is only a test, I did not strictly use three disks to create the OCR disk group; vdb1 alone serves as the voting disk.
Create the raw devices and configure them to load at boot

#/bin/raw /dev/raw/raw1 /dev/vdb1
#/bin/raw /dev/raw/raw2 /dev/vdc1
vi /etc/rc.d/rc.local 
/bin/raw /dev/raw/raw1 /dev/vdb1
/bin/raw /dev/raw/raw2 /dev/vdc1
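The same rules can be staged in a scratch file and sanity checked before being copied into place, a sketch assuming the device names vdb1/vdc1 above (the local file name mirrors the target /etc/udev/rules.d/60-raw.rules):

```shell
# Stage the raw-device udev rules locally and sanity check them
# before copying to /etc/udev/rules.d/60-raw.rules.
cat > 60-raw.rules <<'EOF'
ACTION=="add", KERNEL=="vdb1", RUN+="/bin/raw /dev/raw/raw1 %N"
ACTION=="add", KERNEL=="vdc1", RUN+="/bin/raw /dev/raw/raw2 %N"
ACTION=="add", KERNEL=="raw[1-5]", OWNER="grid", GROUP="asmadmin", MODE="660"
EOF
grep -c '^ACTION' 60-raw.rules   # 3 rules staged
# Then: cp 60-raw.rules /etc/udev/rules.d/ && udevadm control --reload-rules
```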

Disk warning during GI installation

During the GI installer's prerequisite checks, a warning is raised for the shared disks; clicking Detail shows a PRVF-5150 error.
PRVF-5150 : Path … is not a valid path on all nodes
This error can simply be ignored.

root script fails

At the end of the GI installation the root script must be run; it failed as follows:

Using configuration parameter file: /u01/app/grid/product/11.2.0/grid_1/crs/install/crsconfig_params
User ignored Prerequisites during installation
Installing Trace File Analyzer
Adding Clusterware entries to inittab
ohasd failed to start
Failed to start the Clusterware. Last 20 lines of the alert log follow: 
2019-06-20 09:42:14.653: 
[client(29322)]CRS-2101:The OLR was formatted using version 3.
2019-06-20 09:52:51.519: 
[ohasd(29958)]CRS-0715:Oracle High Availability Service has timed out waiting for init.ohasd to be started.
/u01/app/grid/product/11.2.0/grid_1/perl/bin/perl -I/u01/app/grid/product/11.2.0/grid_1/perl/lib -I/u01/app/grid/product/11.2.0/grid_1/crs/install /u01/app/grid/product/11.2.0/grid_1/crs/install/rootcrs.pl execution failed

Workaround

#nohup /etc/rc.d/init.d/init.ohasd run &
Then run root.sh again:
#/u01/app/grid/product/11.2.0/grid_1/root.sh
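The workaround amounts to "start init.ohasd, wait until it is up, rerun root.sh". A minimal sketch using a generic wait_for helper (the helper is hypothetical, exercised here with trivial commands; paths are those from the log above):

```shell
# wait_for <tries> <command...>: retry a command once per second until it
# succeeds, giving up after <tries> attempts.
wait_for() {
  tries=$1; shift
  while [ "$tries" -gt 0 ]; do
    "$@" && return 0
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

wait_for 3 true   # succeeds on the first attempt

# On the node it would be used like this:
#   nohup /etc/rc.d/init.d/init.ohasd run &
#   wait_for 30 sh -c "ps -ef | grep -q '[i]nit.ohasd'"
#   /u01/app/grid/product/11.2.0/grid_1/root.sh
```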

Summary

  • 1. Apart from the network issues, the installation is no different from a RAC install in a conventional environment.
  • 2. Although the cluster can be built this way, that does not mean it is usable in production.
    Throughput on the virtual interface (screenshot omitted)
    Throughput on the ECS interface (screenshot omitted)

Network throughput between the RAC nodes is only about 10 MB/s, and the public and private networks are separated in logic only; physically they share the same network. Neither the performance nor the security meets production standards.
