随風

High-Availability Clusters: Corosync + Pacemaker, and Building an HA Cluster with the crm Command and an NFS Server

Red Hat Enterprise Linux 5 uses OpenAIS as the underlying messaging API, with CMAN on top of it as the Messaging Layer and rgmanager as the CRM that manages resources.

By design, Corosync has a better messaging mechanism than heartbeat.

Red Hat Enterprise Linux 6 uses Corosync directly as the cluster's Messaging Layer.

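APIs from different vendors differ in the libraries they call, their function types, and the way they return results, so a common standard is needed to keep those APIs as compatible as possible.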

For example, an ASUS motherboard still works with a mouse from another vendor.

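The Application Interface Specification (AIS) is a collection of open specifications that define application programming interfaces (APIs). Acting as middleware, these interfaces give application services an open, highly portable programming interface. Using the AIS APIs reduces application complexity and development time.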

OpenAIS components: CLM, CKPT, EVT, LCK, MSG, ...

OpenAIS releases: Picacho, Whitetank, Wilson (Wilson being the newest).

Corosync is the open cluster engine project that was split out of OpenAIS when it reached the Wilson release.

Starting with 0.9, OpenAIS was split into Wilson and Corosync.

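Corosync itself is only a cluster engine that handles the passing of cluster messages; in other words, it serves as the Messaging Layer. Corosync has no resource management of its own, so the CRM role must be filled by pacemaker. Pacemaker was split off from heartbeat as an independent project when heartbeat moved to V3, and since the split its development has focused on Corosync rather than heartbeat V3.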

A Corosync cluster's resources can be configured entirely from the command line, though many graphical tools exist as well.

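corosync is the low-level messaging layer of an HA cluster. It interacts with the layers above it and carries heartbeats plus whatever transaction messages those layers need to send. To guard against the problems that follow a split brain, there is also the concept of quorum. The version installed here is 1.4, which counts the cluster's votes, one vote per node; from 2.x on there is a voting feature that lets a node hold a configurable number of votes. The vote count is then handed to the CRM layer, which decides whether the cluster partition should keep running. Look up the remaining concepts yourselves; I don't know this area well either, and I type really slowly.
pacemaker is the CRM (Cluster Resource Manager) layer of an HA cluster. It is a service and can be started on its own, but with corosync 1.4 as used here it can be configured so that corosync starts pacemaker.
pacemaker can be configured from any node by installing crmsh or pcs, or with one of several GUI tools. crmsh apparently stopped shipping with Red Hat after 6.4, where the official tool is pcs; crmsh seems to be developed by openSUSE.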
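
To make the quorum rule above concrete, here is a minimal Python sketch. It assumes the plain majority rule (a partition has quorum when it holds more than half of the expected votes). The function name `has_quorum` is made up for illustration and is not part of corosync or pacemaker.

```python
def has_quorum(partition_votes: int, expected_votes: int) -> bool:
    """Return True when a partition holds a strict majority of the expected votes."""
    if expected_votes <= 0:
        raise ValueError("expected_votes must be positive")
    # Strict majority: exactly half of the votes is not enough.
    return 2 * partition_votes > expected_votes


if __name__ == "__main__":
    # Two-node cluster, one vote per node (as in corosync 1.4).
    print(has_quorum(2, 2))  # both nodes up
    print(has_quorum(1, 2))  # one node left: 1 is not more than half of 2
    print(has_quorum(2, 3))  # three-node cluster with one node down
```

Under this rule a two-node cluster loses quorum as soon as one node stops, which is what the "partition WITHOUT quorum" status lines later in this post show.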

Corosync website: www.corosync.org

OpenAIS website: www.openais.org

Pacemaker website: www.clusterlabs.org

So the possible Messaging Layer and CRM combinations are:

1 haresources + heartbeat v1/v2

2 crm + heartbeat v2

3 pacemaker + corosync

4 pacemaker + heartbeat v3

5 cman + rgmanager

Today we use Pacemaker + Corosync to define and manage a cluster service.

Installation can be done from rpm packages, by compiling from source, or directly with yum.

________________________________________________________________________________________________________

192.168.139.2

[root@www ~]# ntpdate cn.ntp.org.cn //sync the clock over NTP; this is a public NTP server for the China zone

[root@www .ssh]# ssh-keygen -t rsa -P '' //set up key-based SSH trust between the two nodes

[root@www .ssh]# ssh-copy-id -i ./id_rsa.pub [email protected]

[root@www html]# uname -n //this node's name

www.rs1.com

[root@www mysql]# yum install corosync pacemaker //install directly with yum

________________________________________________________________________________________________________

192.168.139.4

[root@www ~]# ntpdate cn.ntp.org.cn

[root@www .ssh]# ssh-keygen -t rsa -P ''

[root@www .ssh]# ssh-copy-id -i ./id_rsa.pub [email protected]

[root@www html]# uname -n

www.rs2.com

[root@www mysql]# yum install corosync pacemaker

Installed:

corosync.x86_64 0:1.4.7-5.el6 pacemaker.x86_64 0:1.1.14-8.el6_8.1

Dependency Installed:

clusterlib.x86_64 0:3.0.12.1-78.el6 corosynclib.x86_64 0:1.4.7-5.el6 libibverbs.x86_64 0:1.1.8-4.el6 libqb.x86_64 0:0.17.1-2.el6 librdmacm.x86_64 0:1.0.21-0.el6 lm_sensors-libs.x86_64 0:3.1.1-17.el6

net-snmp-libs.x86_64 1:5.5-57.el6_8.1 pacemaker-cli.x86_64 0:1.1.14-8.el6_8.1

pacemaker-cluster-libs.x86_64 0:1.1.14-8.el6_8.1

pacemaker-libs.x86_64 0:1.1.14-8.el6_8.1 pciutils.x86_64 0:3.1.10-4.el6 rdma.noarch 0:6.8_4.1-1.el6

[root@www mysql]# rpm -ql corosync

/etc/corosync //this directory holds Corosync's configuration files

/etc/corosync/corosync.conf.example //sample Corosync configuration file

/usr/sbin/corosync-keygen //this command generates the authentication key

[root@www mysql]# cd /etc/corosync

[root@www corosync]# ll

total 16

-rw-r--r--. 1 root root 2663 May 11 2016 corosync.conf.example

[root@www corosync]# cp corosync.conf.example corosync.conf

[root@www corosync]# vim corosync.conf

# Please read the corosync.conf.5 manual page

compatibility: whitetank

totem {

version: 2 //configuration file version

secauth: off //whether to enable secure authentication; when enabled under aisexec it is very CPU-intensive

threads: 0 //number of threads, chosen from the CPU and core count; has no effect when secauth is off

interface {

ringnumber: 0 //redundant ring number; with a single NIC per node it need not be set and defaults to 0

bindnetaddr: 192.168.139.0 //the network address of the NIC's subnet, not the host's IP address

mcastaddr: 239.255.1.1 //multicast address used for heartbeat messages

mcastport: 5405 //port used for multicast

ttl: 1 //TTL of the multicast packets

}

}

logging {

fileline: off //whether to print the source file and line number with each message

to_stderr: no //whether to send messages to standard error; best left disabled

to_logfile: yes //whether to write to a log file

logfile: /var/log/cluster/corosync.log //location of the dedicated log file; the directory must be created by hand

to_syslog: no //whether to log to syslog; enabling one of to_syslog and to_logfile is enough

debug: off //whether to enable debug output

timestamp: on //whether to print timestamps; useful for locating errors, but each record needs a system call to get the time, which costs CPU

logger_subsys {

subsys: AMF //logging for the AMF subsystem; not needed unless OpenAIS is in use

debug: off

}

}

amf {

mode: disabled //programming-related; can be left unset

}

service {

ver: 0

name: pacemaker //have corosync start pacemaker

}

aisexec { //this section is optional

user: root

group: root

}
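
Since bindnetaddr must be the network address rather than a host address, it can help to compute it instead of guessing. The sketch below uses Python's standard `ipaddress` module; the helper name `bind_net_addr` is made up for illustration, and the /24 prefix is an assumption taken from the netmask used elsewhere in this post.

```python
import ipaddress


def bind_net_addr(ip_with_prefix: str) -> str:
    """Return the network address for a host address given as 'a.b.c.d/prefix'."""
    iface = ipaddress.ip_interface(ip_with_prefix)
    return str(iface.network.network_address)


if __name__ == "__main__":
    # Both cluster nodes sit in the same /24, so they share one bindnetaddr.
    for host in ("192.168.139.2/24", "192.168.139.4/24"):
        print(host, "->", bind_net_addr(host))
```

Both nodes resolve to 192.168.139.0, which is why the same corosync.conf can be copied to the other node unchanged.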

___________________________________________________________________________________________

[root@www ~]# corosync-keygen //generate the communication key, saved as /etc/corosync/authkey

Writing corosync key to /etc/corosync/authkey

[root@www cluster]# corosync-keygen

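The key is generated from /dev/random, so on a freshly installed system with little activity there may not be enough entropy and the prompt below can appear. Type randomly on the local keyboard; keystrokes over an ssh session did not seem to help.
Gathering 1024 bits for key from /dev/random.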
Press keys on your keyboard to generate entropy.
Press keys on your keyboard to generate entropy (bits = 240).

[root@www ~]# cd /etc/corosync/

[root@www cluster]# scp /etc/corosync/corosync.conf 192.168.139.2:/etc/corosync/ //copy the file to the other node

[root@www ~]# service corosync start //start corosync on this node

[root@www ~]# ssh 192.168.139.2 service corosync start //start corosync on the other node

__________________________________________________________________________________________

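//Check for errors during startup. I could not find the cause online, but the messages themselves say that the Pacemaker plugin for Corosync is not supported in this environment and point to CMAN instead. The whole experiment still completed, so it does not seem to be a serious error.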

[root@www cluster]# grep ERROR: /var/log/cluster/corosync.log

Nov 11 15:05:10 www corosync[3470]: [pcmk ] ERROR: process_ais_conf: You have configured a cluster using the Pacemaker plugin for Corosync. The plugin is not supported in this environment and will be removed very soon.

Nov 11 15:05:10 www corosync[3470]: [pcmk ] ERROR: process_ais_conf: Please see Chapter 8 of 'Clusters from Scratch' (http://www.clusterlabs.org/doc) for details on using Pacemaker with CMAN

__________________________________________________________________________________________

[root@www ~]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log //check that the corosync engine started correctly

Nov 11 16:34:19 corosync [MAIN ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.

Nov 11 16:34:19 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.

Nov 11 16:34:19 [1908] www.rs2.com cib: info: retrieveCib:Reading cluster configuration file /var/lib/pacemaker/cib/cib.xml (digest: /var/lib/pacemaker/cib/cib.xml.sig)

Nov 11 16:34:19 [1908] www.rs2.com cib: info: cib_file_write_with_digest:Reading cluster configuration file /var/lib/pacemaker/cib/cib.DU5D4x (digest: /var/lib/pacemaker/cib/cib.zBJmL2)

__________________________________________________________________________________________

[root@www ~]# grep TOTEM /var/log/cluster/corosync.log //check that the initial membership notifications are normal

Nov 11 16:34:07 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).

Nov 11 16:34:07 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).

Nov 11 16:34:08 corosync [TOTEM ] The network interface [192.168.139.4] is now up.

Nov 11 16:34:08 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.

__________________________________________________________________________________________

[root@www ~]# grep error /var/log/cluster/corosync.log //check for errors during startup; these come from having no STONITH device configured and can be ignored here; the crm command property stonith-enabled=false disables STONITH later

Nov 11 16:34:32 [2174] www.rs2.com pengine: error: unpack_resources:Resource start-up disabled since no STONITH resources have been defined

Nov 11 16:34:32 [2174] www.rs2.com pengine: error: unpack_resources:Either configure some or disable STONITH with the stonith-enabled option

Nov 11 16:34:32 [2174] www.rs2.com pengine: error: unpack_resources:NOTE: Clusters with shared data need STONITH to ensure data integrity

___________________________________________________________________________________________

[root@www ~]# grep pcmk_startup /var/log/cluster/corosync.log //check that pacemaker started correctly

Nov 11 16:34:08 corosync [pcmk ] info: pcmk_startup: CRM: Initialized

Nov 11 16:34:08 corosync [pcmk ] Logging: Initialized pcmk_startup

Nov 11 16:34:08 corosync [pcmk ] info: pcmk_startup: Maximum core file size is: 18446744073709551615

Nov 11 16:34:08 corosync [pcmk ] info: pcmk_startup: Service: 9

Nov 11 16:34:08 corosync [pcmk ] info: pcmk_startup: Local hostname:www.rs2.com
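
The grep checks above can be rolled into one pass over the log. This is only a sketch: the marker strings are the ones grepped for in this post, while the names `MARKERS` and `summarize_log` are hypothetical and not part of any corosync tooling.

```python
# Substrings taken from the grep commands in this post.
MARKERS = {
    "engine_started": "Corosync Cluster Engine",
    "totem_initialized": "TOTEM",
    "pacemaker_started": "pcmk_startup",
}


def summarize_log(lines):
    """Return (found, errors): which markers appeared, and the error lines."""
    found = {name: False for name in MARKERS}
    errors = []
    for line in lines:
        for name, needle in MARKERS.items():
            if needle in line:
                found[name] = True
        # Match both "ERROR:" (plugin messages) and "error:" (pacemaker daemons).
        if "ERROR:" in line or "error:" in line:
            errors.append(line.rstrip("\n"))
    return found, errors


if __name__ == "__main__":
    sample = [
        "Nov 11 16:34:19 corosync [MAIN ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.",
        "Nov 11 16:34:07 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).",
        "Nov 11 16:34:08 corosync [pcmk ] info: pcmk_startup: CRM: Initialized",
        "Nov 11 16:34:32 [2174] www.rs2.com pengine: error: unpack_resources:Resource start-up disabled since no STONITH resources have been defined",
    ]
    found, errors = summarize_log(sample)
    print(found)
    print(len(errors), "error line(s)")
```

On a real node, the function could be fed an open file handle for /var/log/cluster/corosync.log instead of the sample list.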

___________________________________________________________________________________________

[root@www ~]# crm_mon //monitors the current state of the cluster

Last updated: Fri Nov 11 16:19:10 2016 Last change: Fri Nov 11 16:10:18 2016 by hacluster via crmd on www.rs2.com

Stack: classic openais (with plugin)

Current DC: www.rs2.com (version 1.1.14-8.el6_8.1-70404b0) - partition WITHOUT quorum

2 nodes and 0 resources configured, 2 expected votes

//Two nodes and 0 resources, but for some unknown reason rs1 is UNCLEAN (offline)

Node www.rs1.com: UNCLEAN (offline)

Online: [ www.rs2.com ]

//After stopping everything, regenerating the corosync configuration file, and starting again, it worked

[root@www .ssh]# crm_mon

Last updated: Fri Oct 28 21:29:51 2016 Last change: Fri Nov 11 22:33:32 2016 by hacluster via crmd on www.rs1.com

Stack: classic openais (with plugin)

Current DC: www.rs1.com (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum

2 nodes and 0 resources configured, 2 expected votes

Online: [ www.rs1.com www.rs2.com ] //both nodes are healthy

__________________________________________________________________________________________

Configuring cluster resources with the crm command

[root@www ~]# crm

-bash: crm: command not found

[root@www ~]# rpm -qa pacemaker //pacemaker is 1.1.14

pacemaker-1.1.14-8.el6_8.1.x86_64

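Starting with pacemaker 1.1.8, crm became an independent project called crmsh. In other words, installing pacemaker does not provide the crm command; to manage cluster resources, crmsh has to be installed separately. The crmsh rpm can be downloaded from: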

http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/x86_64/

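crmsh depends on several packages such as pssh, so pssh.rpm must also be downloaded from the address above. The same location also offers corosync and pacemaker, but I installed those directly with yum.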

https://build.opensuse.org/package/binary/network:ha-clustering:Stable/crmsh?arch=x86_64&filename=crmsh-2.3.2-1.1.noarch.rpm&repository=RedHat_RHEL-6

Alternatively, download openSUSE's HA clustering yum repository file and install from it

[root@www tool]# wget http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/network:ha-clustering:Stable.repo

It contains a single yum repository:

[network_ha-clustering_Stable]
name=Stable High Availability/Clustering packages (CentOS_CentOS-6)
type=rpm-md
baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/
gpgcheck=1
gpgkey=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6//repodata/repomd.xml.key
enabled=1

[root@www tool]# mv network\:ha-clustering\:Stable.repo /etc/yum.repos.d/

[root@www yum.repos.d]# ll //all the yum repositories on my host

total 52

-rw-r--r--. 1 root root CentOS-Base.repo

-rw-r--r--. 1 root root CentOS-Debuginfo.repo

-rw-r--r--. 1 root root 2015 CentOS-fasttrack.repo

-rw-r--r--. 1 root root 2015 CentOS-Media.repo

-rw-r--r--. 1 root root 2015 CentOS-Vault.repo

-rw-r--r--. 1 root root 2014 elrepo.repo

-rw-r--r--. 1 root root 2012 epel.repo

-rw-r--r--. 1 root root 2012 epel-testing.repo

-rw-r--r--. 1 root root network:ha-clustering:Stable.repo

-rw-r--r--. 1 root root openSUSE-13.2-NonFree-Update.repo.back

-rw-r--r--. 1 root root openSUSE-Leap-42.1-Update.repo.bak

-rw-r--r--. 1 root root zxl.repo

[root@www tool]# yum install crmsh //install directly with yum

http://www.111cn.net/sys/linux/73074.htm is a very detailed article I found on using the crm command

[root@www tool]# crm

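crm(live)# help //get help

cib //CIB management module

resource //resource management module

configure //crm configuration, including resource stickiness, resource types, resource constraints, and so on

node //cluster node management subcommands

options //user preferences

history //cluster history

site //geo-cluster support

ra //resource agent information

status //show the cluster status

help, ? //show help

end, cd, up //go back one level

quit, bye, exit //leave crm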

crm(live)# cd resource

crm(live)resource# help

.........................

crm(live)resource# cd

crm(live)# configure //enter configuration mode

crm(live)configure# show //show the current cluster configuration

node www.rs1.com

node www.rs2.com

property cib-bootstrap-options: \

dc-version=1.1.14-8.el6_8.1-70404b0 \

cluster-infrastructure="classic openais (with plugin)" \

expected-quorum-votes=2

crm(live)configure# verify //validate the configuration; it reports errors because no STONITH device is configured

ERROR: error: unpack_resources:Resource start-up disabled since no STONITH resources have been defined

error: unpack_resources:Either configure some or disable STONITH with the stonith-enabled option

error: unpack_resources:NOTE: Clusters with shared data need STONITH to ensure data integrity

Errors found during check: config not valid

crm(live)configure# property stonith-enabled=false //disable STONITH

crm(live)configure# show

node www.rs1.com

node www.rs2.com

property cib-bootstrap-options: \

dc-version=1.1.14-8.el6_8.1-70404b0 \

cluster-infrastructure="classic openais (with plugin)" \

expected-quorum-votes=2 \

stonith-enabled=false

crm(live)configure# verify //check again; no more errors

crm(live)configure# commit //commit so the configuration takes effect

crm(live)configure# cd

crm(live)# ra

crm(live)ra# help

Resource Agents (RA) lists and documentation

Commands:

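classes //list RA classes and providers

info //show detailed information about an RA

list //list all RAs of a class (and provider)

providers //show the providers of a given RA

validate

meta //show an RA's metadata

cd //go back one level

help

quit

up //go back one level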

How do you get detailed help for a command?

crm(live)ra# help list //detailed usage of the list command

List RA for a class (and provider)

List available resource agents for the given class. If the class

is ocf, supply a provider to get agents which are available

only from that provider.

Usage:

list <class> [<provider>]

Example:

list ocf pacemaker

crm(live)ra# classes //list RA classes

lsb //the lsb class

ocf / heartbeat pacemaker //the ocf class has two providers, heartbeat and pacemaker

service

stonith //the stonith class

crm(live)ra# list ocf pacemaker //list all RAs in the ocf class provided by pacemaker

ClusterMon Dummy HealthCPU HealthSMART Stateful SysInfo SystemHealth controld ping pingd remote

crm(live)ra# list lsb //list all RAs in the lsb class

auditd blk-availability corosync corosync-notifyd crond halt heartbeat htcacheclean

crm(live)ra# help meta //meta shows an RA's metadata

Usage:

info [<class>:[<provider>:]]<type> //class : provider : resource agent (RA)

info <type> <class> [<provider>] (obsolete)

For example:

info apache

info ocf:pacemaker:Dummy //class ocf, provider pacemaker, resource agent Dummy

info stonith:ipmilan

info pengine

crm(live)ra# meta ocf:heartbeat:IPaddr //show the metadata of the IPaddr resource agent (class ocf, provider heartbeat)

Parameters (*: required, []: default): //* marks required parameters, [ ] shows defaults

ip* (string): IPv4 or IPv6 address //ip is required

The IPv4 (dotted quad notation) or IPv6 address (colon hexadecimal notation)

example IPv4 "192.168.1.1".

example IPv6 "2001:db8:DC28:0:0:FC57:D4C8:1FFF".

nic (string): Network interface

......................

........................

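Operations' defaults (advisory minimum) //suggested minimum values for the resource's operations

start timeout=20s //wait at most 20 seconds for the resource to start

stop timeout=20s //wait at most 20 seconds for the resource to stop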

status timeout=20s interval=10s

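monitor timeout=20s interval=10s //check every 10 seconds; a check that gets no answer within 20 seconds counts as failed, and the resource is then recovered or moved

How do you find out who provides an RA?

In ra mode, use the providers command, for example:

crm(live)ra# providers IPaddr //show the provider of the IPaddr RA; it is provided by heartbeat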

heartbeat

___________________________________________________________________________________________

Configuring resources

crm(live)ra# cd

crm(live)# configure

crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=192.168.139.10 nic=eth0 cidr_netmask=24

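primitive defines a primitive resource; webip is the resource name; ocf is the resource class, heartbeat the provider, and IPaddr the RA

params introduces the parameters: ip=192.168.139.10 (required), nic=eth0 (eth0 by default), cidr_netmask=24 (a 24-bit netmask)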

crm(live)configure# show

node www.rs1.com

node www.rs2.com

primitive webip IPaddr \

params ip=192.168.139.10 nic=eth0 cidr_netmask=24

property cib-bootstrap-options: \

dc-version=1.1.14-8.el6_8.1-70404b0 \

cluster-infrastructure="classic openais (with plugin)" \

expected-quorum-votes=2 \

stonith-enabled=false

crm(live)configure# verify //check for errors

crm(live)configure# commit //commit once there are no errors

crm(live)configure# show xml //the configuration can also be viewed as XML, which is more detailed

crm(live)configure# cd

crm(live)#

crm(live)# status //the resource is in fact already running; check where

Online: [ www.rs1.com www.rs2.com ]

Full list of resources:

webip(ocf::heartbeat:IPaddr):Started www.rs1.com //rs1 was elected DC, and the webip resource is running on www.rs1.com

___________________________________________________________________________________________

192.168.139.2

[root@www corosync]# ip addr show //the VIP 192.168.139.10 shows up as a secondary address on eth0

2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000

inet 192.168.139.2/24 brd 192.168.139.255 scope global eth0

inet 192.168.139.10/24 brd 192.168.139.255 scope global secondary eth0

[root@www .ssh]# crm

crm(live)# resource

crm(live)resource# stop webip //stop the webip resource

crm(live)resource# list

webip(ocf::heartbeat:IPaddr):(target-role:Stopped) Stopped

crm(live)resource# start webip

crm(live)resource# list

webip(ocf::heartbeat:IPaddr):Started

crm(live)resource# migrate webip //risky: migrating the resource failed with the error below; after forcing it, webip would not start and I had to restart corosync

ERROR: resource.move: No target node: Move requires either a target node or 'force'

status then shows the following error

* webip_start_0 on www.rs2.com 'not configured' (6): call=12, status=complete, exitreason='none',

last-rc-change='Sat Oct 29 08:55:24 2016', queued=1ms, exec=250ms

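It turned out that my rs2 host was a clone with no eth0 NIC, only eth1, while webip is defined on eth0 (^_^). I renamed eth1 to eth0 and rebooted, which fixed it. The following article explains how to rename a NIC: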

http://www.linuxidc.com/Linux/2015-06/118969.htm
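
A quick pre-flight check on each node would have caught the missing eth0 before the resource failed to start. The sketch below is an assumption about how one might do that in Python; `nic_exists` is a made-up helper name, and `socket.if_nameindex()` is the standard-library call that lists the interfaces on the local host.

```python
import socket


def nic_exists(nic, names=None):
    """Return True when `nic` is among the interface names on this host.

    `names` can be passed explicitly (for example, a list gathered from a
    remote node); by default the local interfaces are queried.
    """
    if names is None:
        names = [name for _index, name in socket.if_nameindex()]
    return nic in names


if __name__ == "__main__":
    # A cloned host that only has eth1, as happened with rs2 in this post.
    print(nic_exists("eth0", ["lo", "eth1"]))
    # The interface names actually present on the machine running this script.
    print([name for _index, name in socket.if_nameindex()])
```

Running the check on every node before defining a resource with nic=eth0 shows which hosts would fail with 'not configured'.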

Next, define an httpd resource

_____________________________________________________________

192.168.139.4

[root@www corosync]# rpm -qa httpd //httpd is not installed on this host

[root@www corosync]# yum install httpd //install directly with yum

[root@www html]# vim index.html

www.RS2.com

[root@www html]# service httpd stop

Stopping httpd: [ OK ]

[root@www html]# chkconfig httpd off //never let a cluster resource start at boot

___________________________________________________________________________________________

192.168.139.2

[root@www corosync]# rpm -qa httpd //httpd is not installed on this host

[root@www corosync]# yum install httpd //install directly with yum

[root@www html]# vim index.html //edit the httpd home page so the hosts can be told apart

www.RS1.com

[root@www html]# service httpd stop

Stopping httpd: [ OK ]

[root@www html]# chkconfig httpd off //never let a cluster resource start at boot

___________________________________________________________________________________________

192.168.139.2

[root@www ~]# crm

crm(live)# cd resource

crm(live)resource# list

webip(ocf::heartbeat:IPaddr):Started

crm(live)resource# cd ..

crm(live)# cd ra

crm(live)ra# providers httpd //httpd has no provider

crm(live)ra# list lsb //the httpd RA belongs to the lsb class

auditd blk-availability corosync corosync-notifyd crond halt htcacheclean httpd

crm(live)ra# meta lsb:httpd //meta shows no parameters, only some operations

start and stop Apache HTTP Server (lsb:httpd)

server implementing the current HTTP standards.

Operations' defaults (advisory minimum):

start timeout=15

stop timeout=15

status timeout=15

restart timeout=15

force-reload timeout=15

monitor timeout=15 interval=15

crm(live)ra# cd

crm(live)# configure

crm(live)configure# primitive httpd lsb:httpd op start timeout=20 //define the httpd primitive resource

crm(live)configure# show

node www.rs1.com

node www.rs2.com

primitive httpd lsb:httpd \

op start timeout=20 interval=0

primitive webip IPaddr \

params ip=192.168.139.10 nic=eth0 cidr_netmask=24 \

meta target-role=Started

property cib-bootstrap-options: \

dc-version=1.1.14-8.el6_8.1-70404b0 \

cluster-infrastructure="classic openais (with plugin)" \

expected-quorum-votes=2 \

stonith-enabled=false

crm(live)configure# verify

crm(live)configure# commit

crm(live)configure# cd

crm(live)# status

Last updated: Sat Oct 29 10:39:04 2016Last change: Sat Oct 29 08:33:08 2016 by root via cibadmin on www.rs1.com

Stack: classic openais (with plugin)

Current DC: www.rs2.com (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum

2 nodes and 2 resources configured, 2 expected votes

Online: [ www.rs1.com www.rs2.com ]

Full list of resources: //webip runs on rs1 while httpd runs on rs2

webip(ocf::heartbeat:IPaddr):Started www.rs1.com

httpd(lsb:httpd):Started www.rs2.com

___________________________________________________________________________________________

192.168.139.4

[root@www ~]# netstat -tnlp |grep httpd

tcp 0 0 :::80 LISTEN 1718/httpd

Open 192.168.139.4 in a browser

___________________________________________________________________________________________

192.168.139.2

Put the two resources in a group so they run together on the same node

crm(live)configure# help group //when in doubt, use help

Define a group

Usage:

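group <name> <rsc> [<rsc>...]

//group, then the group name, resource 1, resource 2; a group can also take a description, params, and meta attributes; see the official documentation for the available group params

[description=<description>] //description

[meta attr_list] //meta attributes

[params attr_list] //group params

attr_list :: [$id=<id>] <attr>=<val> [<attr>=<val>...] | $id-ref=<id>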

Example:

group internal_www disk0 fs0 internal_ip apache \

meta target_role=stopped

group vm-and-services vm vm-sshd meta container="vm" //vm-and-services is the group name, vm is resource 1, vm-sshd is resource 2, and meta container="vm" is a meta attribute

crm(live)configure# group webserver webip httpd //webserver is the group name; webip and httpd are its two resources

crm(live)configure# verify

crm(live)configure# commit

crm(live)configure# show

node www.rs1.com

node www.rs2.com

primitive httpd lsb:httpd \

op start timeout=20 interval=0

primitive webip IPaddr \

params ip=192.168.139.10 nic=eth0 cidr_netmask=24 \

meta target-role=Started

group webserver webip httpd

property cib-bootstrap-options: \

dc-version=1.1.14-8.el6_8.1-70404b0 \

cluster-infrastructure="classic openais (with plugin)" \

expected-quorum-votes=2 \

crm(live)configure# cd

crm(live)# status

cibadmin on www.rs1.com

Stack: classic openais (with plugin)

Current DC: www.rs2.com (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum

2 nodes and 2 resources configured, 2 expected votes

Online: [ www.rs1.com www.rs2.com ]

Full list of resources:

Resource Group: webserver //once the webserver group is defined, both resources run on the same node

webip(ocf::heartbeat:IPaddr):Started www.rs1.com

httpd(lsb:httpd):Started www.rs1.com

Test 192.168.139.10 in a browser

crm(live)# node

crm(live)node# standby //put rs1 in standby; the resources move to rs2

crm(live)node# cd

crm(live)# status //the resources moved from rs1 to rs2

Last updated: Sat Oct 29 11:32:08 2016Last change: Sat Oct 29 11:31:51 2016 by root via crm_attribute on www.rs1.com

Stack: classic openais (with plugin)

Current DC: www.rs1.com (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum

*{Why is this not WITHOUT quorum? Can a standby node still vote?}

2 nodes and 2 resources configured, 2 expected votes

Node www.rs1.com: standby

Online: [ www.rs2.com ]

Full list of resources: //the resources keep running normally after rs1 goes standby

Resource Group: webserver //the standby node still counts toward quorum, so the partition keeps quorum and nothing is stopped

webip(ocf::heartbeat:IPaddr):Started www.rs2.com

httpd(lsb:httpd):Started www.rs2.com

crm(live)# node

crm(live)node# online //bring rs1 back online

crm(live)node# cd

crm(live)# status

Current DC: www.rs1.com (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum

2 nodes and 2 resources configured, 2 expected votes

Online: [ www.rs1.com www.rs2.com ]

Full list of resources:

Resource Group: webserver //rs1 is back online; the resources stay on rs2

webip(ocf::heartbeat:IPaddr):Started www.rs2.com

httpd(lsb:httpd):Started www.rs2.com

This time, stop rs2 outright

192.168.139.4

[root@www ~]# service corosync stop

Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]

Waiting for corosync services to unload:. [ OK ]

192.168.139.2

crm(live)# status

Last updated: Sat Oct 29 11:53:25 2016Last change: Sat Oct 29 11:52:39 2016 by root via crm_attribute on www.rs1.com

Stack: classic openais (with plugin)

Current DC: www.rs1.com (version 1.1.14-8.el6_8.1-70404b0) - partition WITHOUT quorum

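{This time it is WITHOUT quorum: the required number of votes was not reached. It seems a node only loses its vote when the service is stopped; a standby node can still vote}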

2 nodes and 2 resources configured, 2 expected votes

Online: [ www.rs1.com ]

OFFLINE: [ www.rs2.com ]

Full list of resources:

Resource Group: webserver //without quorum, the default policy is to stop resources

webip(ocf::heartbeat:IPaddr):Stopped

httpd(lsb:httpd):Stopped

192.168.139.4

[root@www ~]# service corosync start

192.168.139.2

crm(live)# status

Last updated: Sat Oct 29 11:59:36 2016Last change: Sat Oct 29 11:52:39 2016 by root via crm_attribute on www.rs1.com

Stack: classic openais (with plugin)

Current DC: www.rs1.com (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum

2 nodes and 2 resources configured, 2 expected votes

Online: [ www.rs1.com www.rs2.com ]

Full list of resources: //after rs2 starts, the resources start again

Resource Group: webserver

webip(ocf::heartbeat:IPaddr):Started www.rs1.com

httpd(lsb:httpd):Started www.rs1.com

Change the action taken when quorum is lost to ignore

crm(live)# configure

crm(live)configure# property no-quorum-policy=ignore

crm(live)configure# show

node www.rs1.com \

attributes standby=off

node www.rs2.com \

attributes standby=off

primitive httpd lsb:httpd \

op start timeout=20 interval=0

primitive webip IPaddr \

params ip=192.168.139.10 nic=eth0 cidr_netmask=24 \

meta target-role=Started

group webserver webip httpd

property cib-bootstrap-options: \

dc-version=1.1.14-8.el6_8.1-70404b0 \

cluster-infrastructure="classic openais (with plugin)" \

expected-quorum-votes=2 \

stonith-enabled=false \

no-quorum-policy=ignore

crm(live)configure# verify

crm(live)configure# commit

192.168.139.4

[root@www ~]# service corosync stop

192.168.139.2

crm(live)# status

Last updated: Sat Oct 29 12:03:53 2016Last change: Sat Oct 29 12:03:25 2016 by root via cibadmin on www.rs1.com

Stack: classic openais (with plugin)

Current DC: www.rs1.com (version 1.1.14-8.el6_8.1-70404b0) - partition WITHOUT quorum

{WITHOUT quorum: not enough votes}

2 nodes and 2 resources configured, 2 expected votes

Online: [ www.rs1.com ]

OFFLINE: [ www.rs2.com ]

Full list of resources: //but the services keep running because of ignore

Resource Group: webserver

webip(ocf::heartbeat:IPaddr):Started www.rs1.com

httpd(lsb:httpd):Started www.rs1.com
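
The two behaviours seen so far (resources stopped under the default policy, resources kept running under ignore) can be summarised in a small Python sketch. It models only the stop and ignore values used in this post; Pacemaker has further no-quorum-policy values that are deliberately not modelled here, and the function name `resources_may_run` is made up for illustration.

```python
def resources_may_run(has_quorum: bool, no_quorum_policy: str = "stop") -> bool:
    """Simplified model of the two no-quorum-policy values used in this post."""
    if no_quorum_policy not in ("stop", "ignore"):
        raise ValueError(f"policy not modelled here: {no_quorum_policy}")
    if has_quorum:
        return True
    # Without quorum: "stop" stops all resources, "ignore" carries on regardless.
    return no_quorum_policy == "ignore"


if __name__ == "__main__":
    # rs2 stopped, default policy: the webserver group is stopped.
    print(resources_may_run(has_quorum=False, no_quorum_policy="stop"))
    # rs2 stopped, policy set to ignore: the group keeps running on rs1.
    print(resources_may_run(has_quorum=False, no_quorum_policy="ignore"))
```

In a two-node cluster, ignore keeps the service up when one node dies, at the cost of losing the protection quorum gives against split brain.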

192.168.139.4

[root@www ~]# service corosync start

[root@www ~]# crm

crm(live)# node

crm(live)node# standby

crm(live)node# cd

crm(live)# status

Last updated: Sat Oct 29 10:03:51 2016Last change: Sat Oct 29 10:03:46 2016 by root via crm_attribute on www.rs1.com

Stack: classic openais (with plugin)

Current DC: www.rs1.com (version 1.1.14-8.el6_8.1-70404b0) - partition with quorum

{Quorum is still held here, so standby nodes can indeed still vote}

2 nodes and 2 resources configured, 2 expected votes

Node www.rs2.com: standby

Online: [ www.rs1.com ]

Full list of resources:

Resource Group: webserver //with ignore set, resources run whether or not quorum is held

webip(ocf::heartbeat:IPaddr):Started www.rs1.com

httpd(lsb:httpd):Started www.rs1.com

crm(live)# node

crm(live)node# online

Without defining a group, use constraints directly to keep the resources together

crm(live)# resource

crm(live)resource# stop webserver

crm(live)resource# cleanup webserver

crm(live)resource# cd

crm(live)# configure

crm(live)configure# delete webserver

crm(live)configure# show

node www.rs1.com \

attributes standby=off

node www.rs2.com \

attributes standby=off

primitive httpd lsb:httpd \

op start timeout=20 interval=0

primitive webip IPaddr \

params ip=192.168.139.10 nic=eth0 cidr_netmask=24 \

meta target-role=Started

property cib-bootstrap-options: \

dc-version=1.1.14-8.el6_8.1-70404b0 \

cluster-infrastructure="classic openais (with plugin)" \

expected-quorum-votes=2 \

stonith-enabled=false \

no-quorum-policy=ignore \

last-lrm-refresh=1477714758

crm(live)configure# verify

crm(live)configure# commit

crm(live)# status

Online: [ www.rs1.com www.rs2.com ]

Full list of resources: //the two resources are running on different nodes again

webip(ocf::heartbeat:IPaddr):Started www.rs1.com

httpd(lsb:httpd):Started www.rs2.com

Define a colocation constraint (whether resources may run on the same node; inf means infinity)

crm(live)# configure

crm(live)configure# colocation webip_with_httpd inf: webip httpd //define a colocation constraint between the two resources

crm(live)configure# show

.........

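colocation webip_with_httpd inf: webip httpd //this looks reversed: it means webip goes wherever httpd is; it should be httpd goes wherever webip is; the resource listed last decides the placement

crm(live)configure# edit //fix it directly with edit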

colocation webip_with_httpd inf: webip httpd

change to

colocation webip_with_httpd inf: httpd webip

crm(live)configure# show xml

crm(live)configure# commit

crm(live)configure# cd

crm(live)# status

.........

Online: [ www.rs1.com www.rs2.com ]

Full list of resources: //the two resources run on one node again

webip(ocf::heartbeat:IPaddr):Started www.rs1.com

httpd(lsb:httpd):Started www.rs1.com

The colocation constraint now ties the two resources together. Resources also start in a particular order, which is defined with an order constraint

crm(live)# configure

crm(live)configure# help order

Usage:

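order <id> [{kind|<score>}:] first then [symmetrical=<bool>]

order <id> [{kind|<score>}:] resource_sets [symmetrical=<bool>]

kind :: Mandatory | Optional | Serialize //mandatory | optional | serialized

first :: <rsc>[:<action>] //an action can follow the resource, naming the operation on one resource that must complete before the other proceeds; the actions are those under resource, such as start, stop, promote, ...

then :: <rsc>[:<action>]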

resource_sets :: resource_set [resource_set ...]

crm(live)configure# order webip_before_httpd mandatory: webip httpd //webip_before_httpd is the id; mandatory is the kind (a score may be given instead); webip starts first, httpd second

crm(live)configure# commit

crm(live)configure# show xml

first webip,then httpd
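
An order constraint is a dependency edge, so the resulting start sequence is a topological sort. The Python sketch below illustrates that idea with the standard `graphlib` module (Python 3.9 or later). It is not how Pacemaker computes transitions; `start_order` is a made-up helper, and the reversed stop order assumes the default symmetrical ordering.

```python
from graphlib import TopologicalSorter


def start_order(order_constraints):
    """Return a start sequence for (first, then) pairs, 'first' always earlier."""
    sorter = TopologicalSorter()
    for first, then in order_constraints:
        # `then` depends on `first`, so `first` is its predecessor.
        sorter.add(then, first)
    return list(sorter.static_order())


if __name__ == "__main__":
    constraints = [("webip", "httpd")]  # order webip_before_httpd
    starts = start_order(constraints)
    print("start:", starts)
    # With symmetrical ordering, resources stop in the reverse order.
    print("stop: ", list(reversed(starts)))
```

Adding further pairs, for example an NFS mount that must come up before httpd, extends the same sequence without changing the code.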

crm(live)configure# cd

crm(live)# status

Online: [ www.rs1.com www.rs2.com ]

Full list of resources: //currently running on rs1

webip(ocf::heartbeat:IPaddr):Started www.rs1.com

httpd(lsb:httpd):Started www.rs1.com

crm(live)# node

crm(live)node# standby //put rs1 in standby

crm(live)node# cd

crm(live)# status

Node www.rs1.com: standby

Online: [ www.rs2.com ]

Full list of resources: //the switch was too fast to see which started first (^_^), but the resources did move

webip(ocf::heartbeat:IPaddr):Started www.rs2.com

httpd(lsb:httpd):Started www.rs2.com

crm(live)# node

crm(live)node# online //bring rs1 back online

crm(live)node# cd

crm(live)# status

Online: [ www.rs1.com www.rs2.com ]

Full list of resources: //but the resources did not move back

webip(ocf::heartbeat:IPaddr):Started www.rs2.com

httpd(lsb:httpd):Started www.rs2.com

What if the resources should move back once the node is online again?

Define a location constraint (which node a resource prefers to run on)

crm(live)# configure

crm(live)configure# help location

Usage:

location <id> <rsc> [<attributes>] {<node_pref>|<rules>}

........

node_pref :: <score>: <node>

rules :: //rules can be defined with expressions

rule [id_spec] [$role=<role>] <score>: <expression>

[rule [id_spec] [$role=<role>] <score>: <expression> ...]

location conn_1 internal_www \ //conn_1 is the id/name; internal_www is the resource name

rule 50: #uname eq node1 \ //the rule gives a score of 50 when uname equals node1

crm(live)configure# location wibip_on_rs1 webip rule 100: #uname eq www.rs1.com

//when uname equals www.rs1.com, the location score is 100

crm(live)configure# verify

crm(live)configure# commit

crm(live)configure# show xml

crm(live)configure# cd

crm(live)# status

Online: [ www.rs1.com www.rs2.com ]

Full list of resources: //the location constraint took effect, so the resources moved to rs1 automatically

webip(ocf::heartbeat:IPaddr):Started www.rs1.com

httpd(lsb:httpd):Started www.rs1.com

crm(live)# node

crm(live)node# standby //put rs1 in standby

crm(live)node# cd

crm(live)# status

Node www.rs1.com: standby

Online: [ www.rs2.com ]

Full list of resources: //the resources moved to rs2

webip(ocf::heartbeat:IPaddr):Started www.rs2.com

httpd(lsb:httpd):Started www.rs2.com

crm(live)# node

crm(live)node# online

crm(live)node# cd

crm(live)# status

Online: [ www.rs1.com www.rs2.com ]

Full list of resources: //after rs1 came back online, the resources moved back from rs2

webip(ocf::heartbeat:IPaddr):Started www.rs1.com

httpd(lsb:httpd):Started www.rs1.com

Define stickiness for resources (how strongly a resource prefers to stay on its current node)

crm(live)# configure

crm(live)configure# rsc_defaults resource-stickiness=200 //set the resource stickiness to 200

crm(live)configure# verify

crm(live)configure# commit

crm(live)configure# show xml

crm(live)configure# cd

crm(live)# node standby

crm(live)# status

Node www.rs1.com: standby

Online: [ www.rs2.com ]

Full list of resources: //the resources moved to rs2

webip(ocf::heartbeat:IPaddr):Started www.rs2.com

httpd(lsb:httpd):Started www.rs2.com

crm(live)# node online //bring it back online

crm(live)# status

Online: [ www.rs1.com www.rs2.com ]

Full list of resources: //stickiness (200) outweighs the location preference (100), so the resources do not move back to rs1

webip(ocf::heartbeat:IPaddr):Started www.rs2.com

httpd(lsb:httpd):Started www.rs2.com
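
The comparison between stickiness and the location score can be written down as a tiny scoring model. This Python sketch is a deliberate simplification: real Pacemaker also combines scores across colocated resources and other constraints, which is not modelled here, and `preferred_node` is a made-up name.

```python
def preferred_node(current_node, location_scores, stickiness):
    """Pick the node with the highest score; the current node gets the stickiness bonus.

    location_scores maps every candidate node to its location score (0 if none).
    """
    scores = dict(location_scores)
    scores[current_node] = scores.get(current_node, 0) + stickiness
    # On a tie, prefer staying put: (score, is_current) ranks the current node higher.
    return max(scores, key=lambda node: (scores[node], node == current_node))


if __name__ == "__main__":
    location = {"www.rs1.com": 100, "www.rs2.com": 0}  # location wibip_on_rs1
    # Resources currently on rs2 after rs1 was put in standby and brought back.
    print(preferred_node("www.rs2.com", location, stickiness=200))
    print(preferred_node("www.rs2.com", location, stickiness=0))
```

With stickiness 200 the current node scores 200 against 100 for rs1, so nothing moves; with no stickiness the location score of 100 pulls the resources back to rs1, matching the two runs shown above.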

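Now add a Filesystem resource and an NFS server at 192.168.139.8 that shares one home page, so the page seen in the browser is the same no matter which node runs the resources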

_____________________________________________________________

192.168.139.8

[root@www ~]# vim /etc/exports

/web/htdocs 192.168.139.0/24(ro)

[root@www local]# cd /web/htdocs/

[root@www htdocs]# vim index.html

www.NFS.com

[root@www ~]# service iptables stop

[root@www ~]# service nfs start

___________________________________________________________________________________________
192.168.139.4

[root@www ~]# mount 192.168.139.8:/web/htdocs /mnt

[root@www ~]# cd /mnt

[root@www mnt]# ll

total 4

-rw-r--r--. 1 nobody nobody 21 Nov 12 2016 index.html

[root@www mnt]# cd

[root@www ~]# umount /mnt/

[root@www ~]# crm

crm(live)# ra

crm(live)ra# list ocf //Filesystem belongs to the ocf class

Filesystem HealthCPU HealthSMART IPaddr

crm(live)ra# providers Filesystem //Filesystem is provided by heartbeat

heartbeat

crm(live)ra# meta ocf:heartbeat:Filesystem

device* (string): block device \\ device is required

The name of block device for the filesystem, or -U, -L options for mount, or NFS mount specification.

directory* (string): mount point \\ the mount point is required

The mount point for the filesystem.

fstype* (string): filesystem type \\ the filesystem type is required

The type of filesystem to be mounted.

options (string): \\ extra mount options, passed to mount with -o

Any extra options to be given as -o options to mount.

For bind mounts, add "bind" here and set fstype to "none".

We will do the right thing for options such as "bind,ro".

crm(live)ra# cd

crm(live)# configure

crm(live)configure# primitive nfs ocf:heartbeat:Filesystem params device=192.168.139.8:/web/htdocs/ directory=/var/www/html/ fstype=nfs op monitor timeout=60s

crm(live)configure# verify

crm(live)configure# commit

crm(live)configure# show

primitive nfs Filesystem \

params device="192.168.139.8:/web/htdocs/" directory="/var/www/html/" fstype=nfs \

primitive webip IPaddr \

params ip=192.168.139.10 nic=eth0 cidr_netmask=24 \

order webip_before_httpd Mandatory: webip httpd

colocation webip_with_httpd inf: httpd webip

location wibip_on_rs1 webip \

rule 100: #uname eq www.rs1.com \

expected-quorum-votes=2 \

stonith-enabled=false \

no-quorum-policy=ignore \

last-lrm-refresh=1477714758

rsc_defaults rsc-options: \

resource-stickiness=200

crm(live)configure# cd

crm(live)# status

Online: [ www.rs1.com www.rs2.com ]

Full list of resources: \\ all three resources have started; webip and httpd run together on rs2, while nfs runs on rs1

webip(ocf::heartbeat:IPaddr):Started www.rs2.com

httpd(lsb:httpd):Started www.rs2.com

nfs(ocf::heartbeat:Filesystem):Started www.rs1.com
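
The nfs primitive defined above gives its monitor operation only a timeout, with no interval. As far as I understand, a monitor with an interval of 0 is not run as a recurring health check, so a mount that disappears later may go unnoticed. The following is only a sketch of how the same primitive could be defined with a recurring monitor on a fresh configuration (it would replace the definition above, not be added next to it). The interval and timeout values are assumptions chosen for illustration, not values from this article.

```shell
# Sketch only: same resource as above, with a recurring monitor.
# interval/timeout values below are assumed, tune them for your environment.
crm configure primitive nfs ocf:heartbeat:Filesystem \
    params device=192.168.139.8:/web/htdocs/ directory=/var/www/html/ fstype=nfs \
    op monitor interval=20s timeout=40s \
    op start timeout=60s \
    op stop timeout=60s
```

Running `verify` before `commit`, as in the steps above, should report any operation timeout that the resource agent considers too short.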

___________________________________________________________________________________________

192.168.139.2

[root@www ~]# cd /var/www/html/

[root@www html]# ll

total 4

-rw-r--r--. 1 nobody nobody 21 Nov 12 2016 index.html

[root@www html]# vim index.html

www.NFS.com

\\ the page shared over NFS is now mounted

How do we make all three resources run on the same node?

Define colocation and order constraints for the Filesystem resource

crm(live)configure# colocation nfs_with_webip inf: nfs webip \\ nfs follows webip: wherever webip runs, nfs runs too

crm(live)configure# order webip_before_nfs mandatory: webip nfs \\ start webip first, then nfs

crm(live)configure# verify

crm(live)configure# commit

crm(live)configure# show

primitive nfs Filesystem \

params device="192.168.139.8:/web/htdocs/" directory="/var/www/html/" fstype=nfs \

op monitor timeout=60s interval=0

primitive webip IPaddr \

params ip=192.168.139.10 nic=eth0 cidr_netmask=24 \

colocation nfs_with_webip inf: nfs webip

order webip_before_httpd Mandatory: webip httpd

colocation webip_with_httpd inf: httpd webip

location wibip_on_rs1 webip \

rule 100: #uname eq www.rs1.com

expected-quorum-votes=2 \

stonith-enabled=false \

no-quorum-policy=ignore \

resource-stickiness=200

crm(live)configure# show xml

crm(live)# status

2 nodes and 3 resources configured, 2 expected votes \\ three resources, two nodes, two expected votes

Online: [ www.rs1.com www.rs2.com ]

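Full list of resources: \\ all resources are now on rs2: the resource stickiness is 200 while webip's location preference for rs1 is only 100, and webip and httpd were already running on rs2 before the Filesystem resource was configured, so all three resources end up on rs2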

webip(ocf::heartbeat:IPaddr):Started www.rs2.com

httpd(lsb:httpd):Started www.rs2.com

nfs(ocf::heartbeat:Filesystem):Started www.rs2.com

crm(live)# q

bye
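
The pairwise colocation and order constraints used above keep the three resources together, but nothing in them orders nfs before httpd. One possible alternative, shown here only as a sketch, is a resource group: its members run on the same node and start in the listed order. The group name `webservice` is hypothetical, and the sketch assumes the four constraints created earlier are deleted first so they do not overlap with the group.

```shell
# Sketch only: replace the pairwise constraints with a single group.
# "webservice" is a made-up name; the constraint ids are the ones defined above.
crm configure delete nfs_with_webip webip_before_nfs webip_with_httpd webip_before_httpd

# Members are colocated and started left to right: IP, then the NFS mount, then httpd.
crm configure group webservice webip nfs httpd

# One-shot status check.
crm_mon -1
```

Whether the existing location constraint on webip should then be moved to the group is a design decision this sketch leaves open.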

[root@www html]# mount \\ on rs2 the NFS share is now mounted

192.168.139.8:/web/htdocs/ on /var/www/html type nfs (rw,vers=4,addr=192.168.139.8,clientaddr=192.168.139.4)

[root@www html]# cd /var/www/html/

[root@www html]# ll

total 4

-rw-r--r--. 1 nobody nobody 21 Nov 12 2016 index.html

[root@www html]# vim index.html \\ this is the page shared by the NFS server

www.NFS.com

Test in a browser

[root@www html]# crm

crm(live)# node

crm(live)node# standby \\ put rs2 into standby

crm(live)# status

Online: [ www.rs1.com www.rs2.com ]

Full list of resources: \\ all resources have moved to rs1

webip (ocf::heartbeat:IPaddr): Started www.rs1.com

httpd (lsb:httpd): Started www.rs1.com

nfs (ocf::heartbeat:Filesystem): Started www.rs1.com

In the browser the page is still www.NFS.com: whichever node serves the request, the web page is the same
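
Checking by eye that all resources landed on one node gets tedious after a few failovers. The sketch below parses status lines of the form shown in this article and reports whether every started resource is on the same node. The function names `resource_nodes` and `all_on_one_node` are hypothetical, and the parsing assumes each resource line contains the word "Started" and ends with the node name, which may not hold for other Pacemaker versions.

```shell
#!/usr/bin/env bash
# Print "<resource> <node>" for each started resource found on stdin.
resource_nodes() {
  awk '/Started/ { split($1, parts, "("); print parts[1], $NF }'
}

# Succeed only if all started resources on stdin share exactly one node.
all_on_one_node() {
  local count
  count=$(resource_nodes | awk '{ print $2 }' | sort -u | wc -l | tr -d ' ')
  [ "$count" -eq 1 ]
}

# Sample taken from the status output above.
sample='webip (ocf::heartbeat:IPaddr): Started www.rs1.com
httpd (lsb:httpd): Started www.rs1.com
nfs (ocf::heartbeat:Filesystem): Started www.rs1.com'

printf '%s\n' "$sample" | resource_nodes
printf '%s\n' "$sample" | all_on_one_node && echo "all resources share one node"
```

On a live cluster the same functions could be fed from `crm_mon -1` instead of the sample text.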
