为什么80%的码农都做不了架构师?>>>
mysql> SELECT DISTINCT
-> CONCAT(t.table_schema,'.',t.table_name) as tbl,
-> t.engine,
-> IF(ISNULL(c.constraint_name),'NOPK','') AS nopk,
-> IF(s.index_type = 'FULLTEXT','FULLTEXT','') as ftidx,
-> IF(s.index_type = 'SPATIAL','SPATIAL','') as gisidx
-> FROM information_schema.tables AS t
-> LEFT JOIN information_schema.key_column_usage AS c
-> ON (t.table_schema = c.constraint_schema AND t.table_name = c.table_name
-> AND c.constraint_name = 'PRIMARY')
-> LEFT JOIN information_schema.statistics AS s
-> ON (t.table_schema = s.table_schema AND t.table_name = s.table_name
-> AND s.index_type IN ('FULLTEXT','SPATIAL'))
-> WHERE t.table_schema NOT IN ('information_schema','performance_schema','mysql')
-> AND t.table_type = 'BASE TABLE'
-> AND (t.engine <> 'InnoDB' OR c.constraint_name IS NULL OR s.index_type IN ('FULLTEXT','SPATIAL'))
-> ORDER BY t.table_schema,t.table_name;
+--------------+--------+------+-------+--------+
| tbl | engine | nopk | ftidx | gisidx |
+--------------+--------+------+-------+--------+
| nd.test | InnoDB | NOPK | | |
| test.example | MyISAM | | | |
| xuhh.test_2 | MyISAM | | | |
+--------------+--------+------+-------+--------+
二.wsrep API(WriteSet Replication) Wsrep API定义了一系列应用回调和复制调用库,来实现事务数据库同步写集(writeset)复制以及相似应用。目的在于从应用细节上实现抽象的,隔离的复制。虽然这个接口的主要目标是基于认证的多主复制,但同样适用于异步和同步的主从复制。 应用回调提供管理事务写集: 1.在发送端,填充写集 2.在接收端,应用写集 接收端通过高优先级事务来应用写集。来确保集群中的每个数据库服务器以相同的顺序,不间断的处理SQL语句,以保持数据的一致性。 三.GTID(Global Transaction ID) wsrep API描述了下面的复制模型。应用程序比如数据库服务器有一个被客户修改过的状态比如是数据库内容。这种改变的状态表示为一系列的原子变化。集群中的所有节点通过同步复制具有相同的状态并以相同的顺序应用这些状态。 wsrep API引入GTID目的: 1.标识状态的改变 2.标识状态本身最后状态改变ID GTID组成部分: 1.状态UUID,唯一性的状态标识和序列变化经历 2.有序的序列号来表示序列中的位置变化 因此,GTID允许比较应用程序的状态,建立有序的状态变化,判断变化是否应用和是否所有节点都有应用。GTID像这样的45eec521-2f34-11e0-0800-2a36050b826b:94530586304。 四.Galera wsrep提供者 Galera是一个wsrep提供者,实现真正的多主几乎同步复制。主要功能有: 1.几乎同步:一个节点上提交的事务确保集群内其他所有节点都提交。没有提交的事务是在某个节点发生故障时丢失。 2.真正的多主:在集群所有节点上可以同时修改同一个表。因此无需主节点故障转移。 3.真正的并行应用:从0.8版本开始 4.没有单点故障 五.Galera集群 Galera集群建立在专有组通讯系统层之上的来实现虚拟同步QoS.虚拟同步统一数据传输和集群成员服务,使的消息传递语义明确。还提供从多个来源的消息总序,非常方便的建立多主集群全局事务ID。Galera集群传输层是一个完整的对称的无向图,因此每个节点由一个单一的TCP连接集群的其他每个节点。默认情况下TCP用于消息复制和集群成员服务,从0.8版本开始,UDP组播可以用于LAN复制。 六.节点存活检测 Galera集群每个节点通过keepalive消息来监控其他节点存活。可以调整keepalive的检测频率和灵敏度来忍受不可靠的网络。 七.主组件(Primary Component) 除了单节点故障,由于网络故障集群可以分割成几个组件。一个组件是一组节点彼此连接不是来自其他组件的节点。在这种情况下只有一个组件可以续集修改状态以避免历史分歧。这就是PC。当集群分区时,Galera调用特殊群体算法选择一个PC来确保集群只有一个PC。 八.脑裂(Split-brain) 像任何基于法定人数系统,当选择PC仲裁算法失败时,Galera集群发生了脑裂。这是可能发生的,例如,在一个没有备份交换机集群中,如果主交换机故障了。最有可能发生脑裂情景是在两个节点的集群中当有单个节点故障时。因此强烈建议Galera集群最低配置3个节点。 九.流量控制 虚拟同步保证一致性,但是没有时间同步,流畅的多主操作是普遍的期望和要求。为了此目的,Galera实现其自己的时间流量控制来保持节点在下一秒同步。可在运行时配置,放宽的主从设置。 十.多主操作 Galera支持多主操作,在这个意义上,所有节点可以同时更改同一个表,并能仍然保持一致性。Galera通过认证算法来检测冲突和回滚事务。多主复制的效率取决于冲突率,而这又依赖于应用程序的负载分布,负载均衡策略和集群节点数。在某些情况下,减少节点数量,可以同时接受写入,可以提高性能。 十一.并行应用 由于galera可以检测writesets冲突,还能检测那些writesets可以同时应用。 十二. 配置参数 Parameter Default Description protonet.backend asio 传输后端使用。当前只支持ASIO protonet.version 0 socket.ssl_cert 指定SSL验证路径 socket.ssl_key 指定私钥路径 socket.ssl_compression yes SSL连接是否压缩 socket.ssl_cipher “AES128-SHA” 对称密码使用 gmcast.listen_addr tcp://0.0.0.0:4567 Galera监听地址,用来与其他节点连接 gmcast.mcast_addr 如果设置,UDP组播将用于复制。gmcast.mcast_addr=239.192.0.11。所有节点要设置为一样 gmcast.mcast_ttl 1 组播报存活时间 gmcast.peer_timeout PT3S 初始化消息的中继连接超时 gmcast.time_wait PT5S Time to wait until allowing peer declared outside of stable view to reconnect. gmcast.version 0 evs.causal_keepalive_period 供开发人员使用。默认evs.keepalive_period。 evs.consensus_timeout PT30S 集群成员达成共识超时时间 evs.debug_log_mask 0x1 控制EVS调试日志记录,当wsrep_debug开启时有效 evs.inactive_check_period PT1S 检测间隔时间 evs.inactive_timeout PT15S 节点多少间隔无响应,宣告down evs.info_log_mask 0 控制额外的的EVS信息日志。0X1 - 额外的视图变化信息;0X2 - 额外的状态变化信息;0x4 - 统计;0x8 - 剖析(仅适用于构建与分析启用) evs.install_timeout PT15S 等待安装消息确认超时时间 evs.join_retrans_period PT1S 发送EVS加入消息的频率 evs.keepalive_period PT1S 发送存活消息频率 evs.max_install_timeouts 1 尝试多少次后放弃 evs.send_window 4 同时复制最大数据包。对于WAN可以设置高些,如512.必须不低于evs.user_send_window evs.stats_report_period PT1M 控制EVS统计报告期 evs.suspect_timeout PT5S 非活动期之后,节点将是死的。如果所有剩余节点上达成一致,在到达evs.inactive_timeout前将被丢弃 evs.use_aggregate true 聚合小的数据包 evs.user_send_window 2 同时复制的最大数据包。对于WAN可以设置高些,如512 evs.view_forget_timeout PT5M Drop past views from view history after that timeout. evs.version 0 pc.bootstrap 将它设置为true,将NON-PRIMARY组件转变为PRIMARY pc.checksum true 复制信息校验和 pc.ignore_sb false 即使在脑裂情况下,允许处理更新吗?在多主下这是一个危险的设置,应简化为主从集群,尤其是适用2个节点 pc.ignore_quorum false 完全无视法定人数计算 pc.linger PT2S PC控制等待EVS中止间隔 pc.npvo false 如果设置为true,在冲突情况下更近的PC会覆盖旧的 pc.wait_prim false 如果设置为true,节点将永远等待PC。用来绑定一个NON-PRIMARY组件。pc.bootstrap来设置PC pc.weight 1 2.4版本,节点权重用来仲裁 pc.version 0 gcs.fc_debug 0 调试统计关于SST流控制的writesets gcs.fc_factor 0.5 Resume replication after recv queue drops below that fraction of gcs.fc_limit. gcs.fc_limit 16 如果接受队列writesets超过将暂停复制 gcs.fc_master_slave NO 假定组中只有一个master gcs.max_packet_size 32616 所有的writesets超过该值,将成碎片 gcs.max_throttle 0.25 在转移过程中节流多少复制率(避免内存耗尽)。为了完成状态转移,如果停止复制可以接受的,设置该值为0.0 gcs.recv_q_hard_limit LLONG_MAX 允许最大的recv队列。通常为(RAM+SWAP)/2。如果超过此限制,服务器将中止 gcs.recv_q_soft_limit 0.25 gcs.sync_donor NO 集群内其他成员与donor保持同步?”yes”意味着如果donor阻塞状态转移,整个集群将阻塞它 ist.recv_addr 增量状态转移监听。默认为:query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
# innodb_doublewrite=1 - this is the default and it should stay this way
# 2. Optional mysqld settings: your regular InnoDB tuning and such
datadir=/mnt/mysql/data
innodb_buffer_pool_size=28G
innodb_log_file_size=100M
innodb_file_per_table
innodb_flush_log_at_trx_commit=2
# 3. wsrep provider configuration: basic wsrep options
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_provider_options="gcache.size=32G; gcache.page_size=1G"
wsrep_cluster_address=gcomm://192.168.0.1,192.168.0.2,192.168.0.3
wsrep_cluster_name='my_galera_cluster'
wsrep_node_address='192.168.0.2'
wsrep_node_name='node2'
wsrep_sst_method=xtrabackup
wsrep_sst_auth=root:rootpa$$ // rsync-based SST不需要此配置
# 4. additional "frequently used" wsrep settings
wsrep_node_incoming_address='192.168.10.2'
wsrep_sst_donor='node3'
wsrep_slave_threads=16 //4*CPU, wsrep_cert_deps_distance值来评估多少事件可并行处理
配置技巧: 1.并行应用需要配置以下两个参数 innodb_autoinc_lock_mode=2 innodb_locks_unsafe_for_binlog=1 2.WAN复制 wsrep_provider_options = "evs.keepalive_period = PT3S; evs.inactive_check_period = PT10S; evs.suspect_timeout = PT30S; evs.inactive_timeout = PT1M; evs.install_timeout = PT1M" 3. Multi-Master 客户端同时写操作多主,认证冲突的概率就高,可能会引起不良的回滚操作和性能衰减。在这种情况下,就应减少主。 4. Master-Slave wsrep_provider_options = "gcs.fc_limit = 256; gcs.fc_factor = 0.99; gcs.fc_master_slave = yes" 减少流量控制事件率这可能在一定程度上提高复制性能。 二十一. 监控
mysql> SHOW STATUS LIKE 'wsrep_%';
二十二. MySQL/Galera集群数据备份 Galera集群的备份和普通的MySQL备份相同。所有节点都是一样的,备份一个节点就相当于备份整个集群。但是会有两个潜在的问题: 1.没有与它们相关联的全局事务ID,是很好的恢复数据,但是不能用来恢复Galera节点。 2.在备份期间可能会阻塞整个集群的操作 在备份时,要关联全局事务ID和避免影响集群,可通过: # /usr/bin/garbd --address gcomm://10.1.1.182
10.1.1.191
10.1.1.202
# rpm -Uhv http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm
# yum install percona-xtrabackup.x86_64
# wget https://launchpad.net/galera/2.x/23.2.4/+download/galera-23.2.4-1.rhel5.x86_64.rpm
# rpm -ivh galera-23.2.4-1.rhel5.x86_64.rpm
# wget https://launchpad.net/codership-mysql/5.5/5.5.29-23.7.3/+download/mysql-5.5.29_wsrep_23.7.3-linux-x86_64.tar.gz
# tar zxvf mysql-5.5.29_wsrep_23.7.3-linux-x86_64.tar.gz
【10.1.1.182】 [mysqld]
# 1. Mandatory settings: these settings are REQUIRED for proper cluster operation
query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
# innodb_doublewrite=1 - this is the default and it should stay this way
# 2. Optional mysqld settings: your regular InnoDB tuning and such
datadir=/data/wsrep/3306
innodb_data_file_path = ibdata1:128M;ibdata2:10M:autoextend
innodb_buffer_pool_size=2G
innodb_log_file_size=100M
innodb_file_per_table
innodb_flush_log_at_trx_commit=2
innodb_locks_unsafe_for_binlog=1
# 3. wsrep provider configuration: basic wsrep options
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_provider_options="gcache.size=32G; gcache.page_size=1G"
wsrep_cluster_address=gcomm://
wsrep_cluster_name='my_galera_cluster'
wsrep_node_address='10.1.1.182'
wsrep_node_name='node-182'
wsrep_sst_method=xtrabackup
wsrep_sst_auth=sst:rootpa$$
# /etc/init.d/mysqld start
SST使用xtrabackup所需的用户权限
mysql> CREATE USER 'sst'@'localhost' IDENTIFIED BY 'rootpa$$';
mysql> GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sst'@'localhost';
【10.1.1.191】 [mysqld]
server_id=191
# 1. Mandatory settings: these settings are REQUIRED for proper cluster operation
query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
# innodb_doublewrite=1 - this is the default and it should stay this way
# 2. Optional mysqld settings: your regular InnoDB tuning and such
datadir=/data/wsrep/3306
innodb_data_file_path = ibdata1:128M;ibdata2:10M:autoextend
innodb_buffer_pool_size=2G
innodb_log_file_size=100M
innodb_file_per_table
innodb_flush_log_at_trx_commit=2
innodb_locks_unsafe_for_binlog=1
# 3. wsrep provider configuration: basic wsrep options
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_provider_options="gcache.size=32G; gcache.page_size=1G"
wsrep_cluster_address=gcomm://10.1.1.182,10.1.1.191,10.1.1.202
wsrep_cluster_name='my_galera_cluster'
wsrep_node_address='10.1.1.191'
wsrep_node_name='node-191'
wsrep_sst_method=xtrabackup
wsrep_sst_auth=sst:rootpa$$
【10.1.1.202】 [mysqld]
# 1. Mandatory settings: these settings are REQUIRED for proper cluster operation
query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
# innodb_doublewrite=1 - this is the default and it should stay this way
# 2. Optional mysqld settings: your regular InnoDB tuning and such
datadir=/data/wsrep/3306
innodb_data_file_path = ibdata1:128M;ibdata2:10M:autoextend
innodb_buffer_pool_size=2G
innodb_log_file_size=100M
innodb_file_per_table
innodb_flush_log_at_trx_commit=2
innodb_locks_unsafe_for_binlog=1
# 3. wsrep provider configuration: basic wsrep options
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_provider_options="gcache.size=32G; gcache.page_size=1G"
wsrep_cluster_address=gcomm://10.1.1.182,10.1.1.202,10.1.1.191
wsrep_cluster_name='my_galera_cluster'
wsrep_node_address='10.1.1.202'
wsrep_node_name='node-202'
wsrep_sst_method=xtrabackup
wsrep_sst_auth=sst:rootpa$$
mysql> show variables like "wsrep_cluster_address";
+-----------------------+----------------------+
| Variable_name | Value |
+-----------------------+----------------------+
| wsrep_cluster_address | gcomm://10.1.1.182 |
+-----------------------+----------------------+
1 row in set (0.00 sec)
mysql> set global wsrep_cluster_address='gcomm://10.1.1.182,10.1.1.191,10.1.1.202';
Query OK, 0 rows affected (3.01 sec)
mysql> show variables like "wsrep_cluster_address";
+-----------------------+-------------------------------------------------+
| Variable_name | Value |
+-----------------------+-------------------------------------------------+
| wsrep_cluster_address | gcomm://10.1.1.182,10.1.1.191,10.1.1.202 |
+-----------------------+-------------------------------------------------+
1 row in set (0.00 sec)
测试复制:
mysql@node-202> create database xuhh;
Query OK, 1 row affected (0.00 sec)
mysql@node-182> use xuhh;
Database changed
mysql@node-182> create table test_1 (node_id int primary key auto_increment, node_name varchar(10));
Query OK, 0 rows affected (0.03 sec)
mysql@node-191> insert into xuhh.test_1 values (1,'xuhh');
Query OK, 1 row affected (0.00 sec)
mysql@node-191> insert into xuhh.test_1 values (2,'xuhh');
Query OK, 1 row affected (0.00 sec)
mysql@node-202> select * from xuhh.test_1;
+---------+-----------+
| node_id | node_name |
+---------+-----------+
| 1 | xuhh |
| 2 | xuhh |
+---------+-----------+
2 rows in set (0.00 sec)
测试myisam引擎表:
mysql@node-182> create table test_2 (node_id int primary key auto_increment, node_name varchar(10)) engine=myisam;
Query OK, 0 rows affected (0.00 sec)
mysql@node-191> insert into xuhh.test_2 values (1,'myisam');
Query OK, 1 row affected (0.00 sec)
mysql@node-191> insert into xuhh.test_2 values (2,'myisam');
Query OK, 1 row affected (0.00 sec)
mysql@node-202> select * from xuhh.test_2;
Empty set (0.00 sec)
mysql@node-182> select * from xuhh.test_2;
Empty set (0.00 sec)
Myisam表引擎的内容没有复制过来。
mysql@node-202> set global wsrep_replicate_myisam=1;
Query OK, 0 rows affected (0.00 sec)
mysql@node-182> set global wsrep_replicate_myisam=1;
Query OK, 0 rows affected (0.00 sec)
mysql@node-191> set global wsrep_replicate_myisam=1;
Query OK, 0 rows affected (0.00 sec)
mysql@node-202|182> insert into xuhh.test_2 values (1,'myisam');
Query OK, 1 row affected (0.00 sec)
mysql@node-202|182> insert into xuhh.test_2 values (5,'myisam_test');
Query OK, 1 row affected, 1 warning (0.00 sec)
mysql@node-191> select * from xuhh.test_2;
+---------+------------+
| node_id | node_name |
+---------+------------+
| 1 | myisam |
| 2 | myisam |
| 5 | myisam_tes |
+---------+------------+
3 rows in set (0.00 sec)
设置wsrep_replicate_myisam=1后,myisam引擎表是可以复制的。node-191会报错 [ERROR] Slave SQL: Error 'Duplicate entry '1' for key 'PRIMARY'' on query. Default database: 'xuhh'. Query: 'insert into xuhh.test_2 values (1,'myisam')', Error_code: 1062 [Warning] WSREP: RBR event 1 Query apply warning: 1, 14 [Warning] WSREP: Ignoring error for TO isolated action: source: 1f0a4ece-ccf9-11e2-0800-bd825e760a16 version: 2 local: 0 state: APPLYING flags: 65 conn_id: 203 trx_id: -1 seqnos (l: 26, g: 14, s: 13, d: 13, ts: 1370396261855962000) HAProxy配置:
# rpm -Uvh http://mirrors.yun-idc.com/epel/5/i386/epel-release-5-4.noarch.rpm
# yum install haproxy.x86_64
# vim haproxy.cfg
global
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
defaults
mode http
log global
option dontlognull
option httpclose
#option httplog
option tcplog
#option forwardfor
option redispatch
timeout connect 10000 # default 10 second time out if a backend is not found
timeout client 300000
timeout server 300000
maxconn 60000
retries 3
frontend stats-front
bind *:8000
mode http
default_backend stats-back
frontend db-cluster-front
bind *:3313
mode tcp
default_backend db-cluster-back
frontend db-cluster-onenode-front
bind *:4306
mode tcp
default_backend db-cluster-onenode-back
backend stats-back
mode http
balance roundrobin
stats uri /haproxy/stats
stats auth xuhh:xuhh123
backend db-cluster-back
mode tcp
balance leastconn
option httpchk
server node-202 10.1.1.202:3306 check port 9200 inter 12000 rise 3 fall 3
server node-182 10.1.1.182:3306 check port 9200 inter 12000 rise 3 fall 3
server node-191 10.1.1.191:3306 check port 9200 inter 12000 rise 3 fall 3
backend db-cluster-onenode-back
mode tcp
balance leastconn
option httpchk
server node-202 10.1.1.202:3306 check port 9200 inter 12000 rise 3 fall 3
server node-182 10.1.1.182:3306 check port 9200 inter 12000 rise 3 fall 3 backup
server node-191 10.1.1.191:3306 check port 9200 inter 12000 rise 3 fall 3 backup
在每个节点上执行以下操作:
mysql> grant process on *.* to ‘clustercheck’@’127.0.0.1’ identified by ‘clusterpassword!’;
# yum install xinetd # vim /etc/services
mysqlchk 9200/tcp # mysqlchk
# vim /etc/xinetd.d/mysqlchk
# default: on
# description: mysqlchk
service mysqlchk
{
# this is a config for xinetd, place it in /etc/xinetd.d/
disable = no
flags = REUSE
socket_type = stream
bind = 10.1.1.191
port = 9200
wait = no
user = nobody
server = /usr/bin/clustercheck
log_on_failure += USERID
only_from = 10.0.0.0/8
# recommended to put the IPs that need
# to connect exclusively (security purposes)
per_source = UNLIMITED
}
# vim /usr/bin/clustercheck
#!/bin/bash
if [[ $1 == '-h' || $1 == '--help' ]];then
echo "Usage: $0 "
exit
fi
MYSQL_USERNAME="${1:-clustercheck}"
MYSQL_PASSWORD="${2:-clustercheckpassword!}"
MYSQL_HOST="127.0.0.1"
MYSQL_PORT="3306"
AVAILABLE_WHEN_DONOR=${3:-0}
ERR_FILE="${4:-/dev/null}"
#Timeout exists for instances where mysqld may be hung
TIMEOUT=10
WSREP_STATUS=`mysql -nNE --connect-timeout=$TIMEOUT --user=${MYSQL_USERNAME} --password=${MYSQL_PASSWORD} --host=${MYSQL_HOST} --port=${MYSQL_PORT} -e "SHOW STATUS LIKE 'wsrep_local_state';" 2>${ERR_FILE} | tail -1 2>>${ERR_FILE}`
WSREP_STATUS=${WSREP_STATUS: -1}
if [[ "${WSREP_STATUS}" == "4" ]] || [[ "${WSREP_STATUS}" == "2" && ${AVAILABLE_WHEN_DONOR} == 1 ]]
then
# Percona XtraDB Cluster node local state is 'Synced' => return HTTP 200
# Shell return-code is 0
echo -en "HTTP/1.1 200 OK\r\n"
echo -en "Content-Type: text/plain\r\n"
echo -en "Connection: close\r\n"
echo -en "Content-Length: 40\r\n"
echo -en "\r\n"
echo -en "MySQL/Galera Cluster Node is synced.\r\n"
exit 0
else
# Percona XtraDB Cluster node local state is not 'Synced' => return HTTP 503
# Shell return-code is 1
echo -en "HTTP/1.1 503 Service Unavailable\r\n"
echo -en "Content-Type: text/plain\r\n"
echo -en "Connection: close\r\n"
echo -en "Content-Length: 44\r\n"
echo -en "\r\n"
echo -en "MySQL/Galera Cluster Node is not synced.\r\n"
exit 1
fi
二十五. 备份与还原
# innobackupex-1.5.1 --defaults-file=/data/3306/my.cnf --galera-info --user=root --socket=/tmp/mysql.sock --password=$PASSWORD /backup/mysql/3306
转载请注明出处:http://www.ttlsa.com/html/1551.html