Environment:
master: 23.247.76.253
[root@nagios_client1 tool]# mysql -V
mysql Ver 14.14 Distrib 5.6.32, for linux-glibc2.5 (x86_64) using EditLine wrapper
[root@nagios_client1 tool]# cat /etc/redhat-release
CentOS release 6.7 (Final)
slave: 23.247.78.254
[root@nagios_client2 ~]# mysql -V
mysql Ver 14.14 Distrib 5.6.32, for linux-glibc2.5 (x86_64) using EditLine wrapper
[root@nagios_client2 ~]# cat /etc/redhat-release
CentOS release 6.9 (Final)
1. Set up the master
Edit the configuration file:
vim /usr/local/mysql_2/my.cnf
Check that the [mysqld] section contains the following lines, and add them if they are missing:
server-id=1
log-bin=mysql-bin
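Only these two lines are required. Two further options can be used to limit what is written to the binary log:
binlog-do-db=test3
binlog-ignore-db=databasename1
binlog-do-db names a database to replicate, and binlog-ignore-db names a database to exclude. Each option takes exactly one database name; to cover several databases, repeat the option on separate lines, because a comma-separated value is treated as the name of a single database. In practice one of the two options is enough. Here test3 is the database I want to replicate.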
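Before restarting, it can help to confirm that the two required options really are inside the [mysqld] section of the file being edited. The following is a minimal sketch, not part of the original setup: check_master_cnf is a hypothetical helper name, and it only looks at the single file passed to it (it does not follow !include directives or other option files).

```shell
#!/bin/bash
# check_master_cnf FILE
# Prints the name of each required option (server-id, log-bin) that is
# missing from the [mysqld] section of FILE. Prints nothing if both exist.
# Hypothetical helper for illustration; not shipped with MySQL.
check_master_cnf() {
    local cnf="$1" opt
    for opt in server-id log-bin; do
        awk -v want="$opt" '
            # A section header switches the "inside [mysqld]" flag.
            /^[[:space:]]*\[/ {
                in_mysqld = ($0 ~ /^[[:space:]]*\[mysqld\]/)
                next
            }
            in_mysqld {
                line = $0
                sub(/^[[:space:]]+/, "", line)
                split(line, kv, /[[:space:]]*=/)
                key = kv[1]
                sub(/[[:space:]]+$/, "", key)
                gsub(/_/, "-", key)      # server_id and server-id are equivalent
                if (key == want) found = 1
            }
            END { exit (found ? 0 : 1) }
        ' "$cnf" || echo "$opt"
    done
}
```

For example, `check_master_cnf /usr/local/mysql_2/my.cnf` prints nothing when both options are present, and prints `log-bin` if only server-id is set.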
Restart after the change:
/etc/init.d/mysqld restart
Shutting down MySQL.. SUCCESS!
Starting MySQL. SUCCESS!
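Then log in to mysql on the master and create the replication account:
mysql> grant replication slave on *.* to 'repl'@'23.247.78.254' identified by '123456';
// repl is the account the slave uses to read the master's binary log, its password is 123456, and 23.247.78.254 is the slave's IP.
mysql> flush tables with read lock; // lock all tables so that no data can be changed; keep this session open, because the lock is released when the session ends
Query OK, 0 rows affected (0.00 sec)
mysql> show master status; // record File and Position; they are needed on the slave shortly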
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000002 | 330 | test3 | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
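Copying File and Position by hand is error-prone, so the two values can also be captured in a script. This is a sketch under the assumption that the status is fetched in batch mode with `mysql -N -B -e "show master status"`, which prints one tab-separated row without headers; parse_master_status is a hypothetical helper name.

```shell
#!/bin/bash
# parse_master_status
# Reads the tab-separated row printed by
#   mysql -N -B -e "show master status"
# on stdin and prints "FILE POSITION". Returns non-zero if no usable row
# was found (for example when binary logging is disabled).
# Hypothetical helper for illustration.
parse_master_status() {
    awk -F'\t' '
        NR == 1 && $1 != "" && $2 ~ /^[0-9]+$/ { print $1, $2; ok = 1 }
        END { exit (ok ? 0 : 1) }
    '
}
```

In bash the result can be read into variables with `read -r log_file log_pos < <(mysql -N -B -e "show master status" | parse_master_status)`.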
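Set up the slave:
First edit the slave's configuration file my.cnf:
vim /etc/my.cnf
Find the line "server-id = 1" and change it to "server-id = 2". The id must be non-zero and different from the master's, otherwise replication fails with an error. On the slave you can also optionally add the following two options, which correspond to the two optional lines on the master (again one database per option; repeat the option for more databases):
replicate-do-db=databasename1
replicate-ignore-db=databasename2
After the change, restart the slave: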
service mysqld restart
Then configure replication on the slave:
mysql> stop slave;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> change master to master_host='23.247.76.253', master_port=3306,
-> master_user='repl', master_password='123456',
-> master_log_file='mysql-bin.000002', master_log_pos=330;
Query OK, 0 rows affected, 2 warnings (0.02 sec)
mysql> start slave;
// master_log_file and master_log_pos are the values obtained from show master status above. After this step, run the following on the master:
mysql> unlock tables;
Query OK, 0 rows affected (0.00 sec)
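The change master to statement used above can also be generated from the recorded coordinates instead of being typed by hand. The sketch below is only an illustration: build_change_master is a hypothetical helper, and it assumes that the host, user, password and file name contain no single quotes, since it does no SQL escaping.

```shell
#!/bin/bash
# build_change_master HOST USER PASSWORD LOG_FILE LOG_POS [PORT]
# Prints a CHANGE MASTER TO statement for the given binary log coordinates.
# Hypothetical helper; assumes the string arguments contain no single quotes.
build_change_master() {
    local host="$1" user="$2" pass="$3" file="$4" pos="$5" port="${6:-3306}" n
    # Position and port are inserted unquoted, so insist on plain digits.
    for n in "$pos" "$port"; do
        case "$n" in
            ''|*[!0-9]*) echo "log_pos and port must be numeric" >&2; return 1 ;;
        esac
    done
    printf "change master to master_host='%s', master_port=%s, master_user='%s', master_password='%s', master_log_file='%s', master_log_pos=%s;\n" \
        "$host" "$port" "$user" "$pass" "$file" "$pos"
}
```

With the values from this article, `build_change_master 23.247.76.253 repl 123456 mysql-bin.000002 330` prints the same statement that was entered on the slave, and the output can be piped into `mysql` there.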
Then check the slave's status:
mysql> show slave status\G
The following error appeared:
Slave_IO_Running: No
Slave_SQL_Running: Yes
Last_IO_Errno: 1593
Last_IO_Error: Fatal error: The slave I/O thread stops because master and slave have equal MySQL server ids; these ids must be different for replication to work (or the --replicate-same-server-id option must be used on slave but this does not always make sense; please check the manual before using it).
Fix:
mysql> show variables like 'server_id'; // show the server id; set global server_id=XX changes it
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id | 1 |
+---------------+-------+
1 row in set (0.00 sec)
mysql> set global server_id=2;
Query OK, 0 rows affected (0.00 sec)
mysql> show variables like 'server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id | 2 |
+---------------+-------+
1 row in set (0.00 sec)
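Error 1593 can be caught before starting replication by comparing the ids of both servers. The following sketch assumes the two values were obtained separately, for example with `mysql -N -B -e "select @@server_id"` on each host; server_ids_ok is a hypothetical helper name.

```shell
#!/bin/bash
# server_ids_ok MASTER_ID SLAVE_ID
# Returns 0 when both ids are non-zero numbers and differ from each other,
# 1 when they are equal or zero, and 2 when an argument is not a number.
# Hypothetical helper for illustration.
server_ids_ok() {
    local master_id="$1" slave_id="$2" id
    for id in "$master_id" "$slave_id"; do
        case "$id" in
            ''|*[!0-9]*) echo "not a number: '$id'" >&2; return 2 ;;
        esac
    done
    # An id of 0 disables replication, and equal ids trigger error 1593.
    [ "$master_id" -ne 0 ] && [ "$slave_id" -ne 0 ] && [ "$master_id" -ne "$slave_id" ]
}
```

For instance, `server_ids_ok 1 1 || echo "fix server-id before running start slave"` would have flagged the situation above.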
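set global only changes the running server, so also set server-id = 2 in the my.cnf that this instance actually reads; otherwise the id reverts at the next restart. Then run stop slave; start slave; so that the I/O thread reconnects, and confirm that both of the following values are Yes: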
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Test replication:
Run on the master:
mysql> select count(*) from t1
-> ;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (0.00 sec)
mysql> drop table t1; // drop table t1
Query OK, 0 rows affected (0.01 sec)
mysql> select count(*) from t1;
ERROR 1146 (42S02): Table 'test3.t1' doesn't exist
Run on the slave:
mysql> select count(*) from t1
-> ;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (0.00 sec)
mysql> select count(*) from t1;
ERROR 1146 (42S02): Table 'test3.t1' doesn't exist
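After t1 is dropped on the master, querying t1 on the slave reports that the table does not exist, which shows that the slave has replayed the master's operation.
Master-slave replication is simple to configure, but the mechanism is also fragile: once data is accidentally written on the slave, replication is easily broken. Also, before restarting the master, stop the slave first. That is, run stop slave on the slave and only then restart the mysql service on the master; otherwise replication may well be interrupted. After the master is back up, run start slave on the slave to resume replication.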
Monitoring MySQL replication status with Nagios
Query on the slave:
[root@nagios_client2 ~]# mysql -uroot -p"123456" -e "show slave status\G"|grep "Running:"
Warning: Using a password on the command line interface can be insecure.
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
When both "Slave_IO_Running: Yes" and "Slave_SQL_Running: Yes" are shown, replication between master and slave is working.
Command:
[root@nagios_client2 ~]# slave_status=($(mysql -uroot -p"123456" -e "show slave status\G"|grep Running |awk '{print $2}'))
Warning: Using a password on the command line interface can be insecure.
[root@nagios_client2 ~]# echo ${slave_status[0]}
Yes
[root@nagios_client2 ~]# echo ${slave_status[1]}
Yes
The check script:
cat /usr/local/nagios/libexec/check_mysql_slave
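#!/bin/bash
# Arrays are a bash feature, so the interpreter is bash rather than sh.
# Collect the values of Slave_IO_Running and Slave_SQL_Running.
slave_status=($(mysql -uroot -p"123456" -e "show slave status\G" | grep "Running:" | awk '{print $2}'))
if [ "${slave_status[0]}" = "Yes" ] && [ "${slave_status[1]}" = "Yes" ]
then
    echo "OK nagios_client2-slave is running"
    exit 0
else
    echo "Critical nagios_client2-slave is error"
    exit 2
fi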
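The script above reports only OK or Critical. A possible extension, sketched here as an assumption rather than as the script used in this setup, also reports UNKNOWN when no status comes back at all (for example when mysql cannot connect) and WARNING when Seconds_Behind_Master exceeds a limit. eval_slave_status and the 60-second default are illustrative choices; the return codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/bash
# eval_slave_status [MAX_LAG_SECONDS]
# Reads the output of: mysql -e "show slave status\G"  on stdin, prints one
# status line and returns the matching Nagios plugin code.
# Hypothetical helper; the default lag limit of 60 seconds is an assumption.
eval_slave_status() {
    local max_lag="${1:-60}" status io sql lag
    status=$(cat)
    # Match the field name exactly so Slave_SQL_Running_State is not picked up.
    io=$(printf '%s\n' "$status" | awk '$1 == "Slave_IO_Running:" { print $2 }')
    sql=$(printf '%s\n' "$status" | awk '$1 == "Slave_SQL_Running:" { print $2 }')
    lag=$(printf '%s\n' "$status" | awk '$1 == "Seconds_Behind_Master:" { print $2 }')
    if [ -z "$io" ] || [ -z "$sql" ]; then
        echo "UNKNOWN no slave status returned"
        return 3
    fi
    if [ "$io" != "Yes" ] || [ "$sql" != "Yes" ]; then
        echo "CRITICAL Slave_IO_Running=$io Slave_SQL_Running=$sql"
        return 2
    fi
    case "$lag" in
        ''|*[!0-9]*)
            echo "WARNING threads running, lag unknown ($lag)"
            return 1 ;;
    esac
    if [ "$lag" -gt "$max_lag" ]; then
        echo "WARNING slave is ${lag}s behind master"
        return 1
    fi
    echo "OK slave is running, ${lag}s behind master"
    return 0
}
```

Inside a plugin it could be used as `mysql -uroot -p"123456" -e "show slave status\G" | eval_slave_status 120; exit $?`, so that the exit code of the plugin is the code returned by the function.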
Add the command to nrpe.cfg:
vi /usr/local/nagios/etc/nrpe.cfg
command[check_mysql_slave]=/usr/local/nagios/libexec/check_mysql_slave
Restart nrpe:
pkill nrpe
/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
Configuration on the Nagios server (first test the check through check_nrpe):
[root@nagios_server objects]# /usr/local/nagios/libexec/check_nrpe -H 23.247.78.254 -c check_mysql_slave
OK nagios_client2-slave is running
Edit the monitoring configuration file for the mysql service:
vi /usr/local/nagios/etc/services/mysql.cfg
Add:
define service {
use generic-service
host_name nagios_client2
service_description check_mysql_replication_status
check_command check_nrpe!check_mysql_slave
max_check_attempts 2
normal_check_interval 2
retry_check_interval 2
check_period 24x7
notification_interval 10
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
process_perf_data 1
}
Add it to the mysql service group: vi servicegroups.cfg
members nagios_client1,port_3306,nagios_client2,port_3306,nagios_client2,check_mysql_replication_status
/etc/init.d/nagios checkconfig # check the configuration files
[root@nagios_server services]# /etc/init.d/nagios reload # reload the configuration files
Final result: