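How it works:
AB replication is carried out by two threads on the slave (the SQL thread and the I/O thread) and one thread on the master (shown as Binlog Dump in the process list).
The slave fetches the binary log from the master and then replays the operations recorded in it on itself, in exactly the same order.
The replication process in three steps:
1) The slave starts its I/O thread, connects to the master, and requests the log content that follows a given position in a given log file (or from the very beginning of the log).
2) On receiving the request, the master's replication thread returns the log events to the slave, together with the name of the master's bin-log file and the position within it.
3) The slave appends the received log content to the end of its relay-log file and records the master's bin-log file name and position in the master-info file, so that on the next read it can tell the master exactly: "send me the log content that follows this position in this bin-log".
Once the slave's SQL thread detects new content in the relay-log, it immediately parses that content back into the statements that were actually executed on the master and executes them locally.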
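The three steps above can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: the bin-log and relay-log are plain lists, positions are list indexes, and the names Master, Slave, io_thread and sql_thread are made up for this sketch and are not MySQL APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """Toy master: the binary log is just a list of statements."""
    binlog: list = field(default_factory=list)

    def dump(self, pos):
        # Step 2: return every event after `pos` plus the new log position.
        return self.binlog[pos:], len(self.binlog)


@dataclass
class Slave:
    relay_log: list = field(default_factory=list)
    master_pos: int = 0   # stands in for the position kept in master.info
    applied: list = field(default_factory=list)
    exec_pos: int = 0     # how much of the relay log has been executed

    def io_thread(self, master):
        # Steps 1 and 3: request events after master_pos, append them to the
        # relay log, and remember the new position for the next request.
        events, new_pos = master.dump(self.master_pos)
        self.relay_log.extend(events)
        self.master_pos = new_pos

    def sql_thread(self):
        # Replay the not-yet-executed relay-log events in their original order.
        for stmt in self.relay_log[self.exec_pos:]:
            self.applied.append(stmt)
        self.exec_pos = len(self.relay_log)


master = Master()
master.binlog += ["create table t (id int)", "insert into t values (1)"]
slave = Slave()
slave.io_thread(master)
slave.sql_thread()
print(slave.applied == master.binlog)  # True
```

Keeping the read position (master_pos) separate from the execute position (exec_pos) mirrors the way the I/O thread and the SQL thread advance independently.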
Environment: ideally both machines run exactly the same mysql version
A:211.100.97.246 Linux x86_64 mysql5.1.56
B:211.100.97.250 Linux x86_64 mysql5.1.56
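Start the mysql process
Start the mysql process on both A and B
Adjust the security settings
Disable selinux, and let iptables allow the mysql port between the two machines
Set SELINUX=disabled in /etc/sysconfig/selinux.
Add the rule: iptables -A INPUT -s SourceIP -p tcp --dport 3306 -j ACCEPT
After the change, test the port:
A: telnet B_IP 3306
B: telnet A_IP 3306
Create accounts (these are operating-system accounts; the mysql replication users are created separately with grant)
A: useradd repl1
B: useradd repl2
Check the account information afterwards
A: id repl1
B: id repl2
A: mysql configuration file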
user=mysql
log-bin=mysql-bin
server-id = 1
binlog-do-db=test
binlog-ignore-db=mysql
replicate-do-db=test
replicate-ignore-db=mysql
log-slave-updates
slave-skip-errors=all
sync_binlog=1
auto_increment_increment=2
auto_increment_offset=1
B: mysql configuration file
user=mysql
log-bin=mysql-bin
server-id = 2
binlog-do-db=test
binlog-ignore-db=mysql
replicate-do-db=test
replicate-ignore-db=mysql
log-slave-updates
slave-skip-errors=all
sync_binlog=1
auto_increment_increment=2
auto_increment_offset=2
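Notes:
server-id is this server's ID within the replication setup; 1 is conventionally used for the master, and no two servers may share the same value
binlog-do-db names the database whose changes are written to the binary log; to log several databases, repeat the option once per database
replicate-do-db names the database to replicate on the slave side
log-bin enables the binary log; only with it enabled can the master's log events be sent to the slave's I/O thread and written to the slave's relay-log
auto_increment_increment sets the step between successive AUTO_INCREMENT values
auto_increment_offset sets the starting value of AUTO_INCREMENT; in a two-master setup each server needs a different offset (for example 1 on A and 2 on B) so that the two servers never generate the same key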
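A small Python sketch of how the two AUTO_INCREMENT settings interleave keys on two masters. It is a simplified model, assuming an empty table and an offset no larger than the increment; the function name auto_increment_values is made up for this example.

```python
def auto_increment_values(offset, increment, count):
    """First `count` AUTO_INCREMENT values for a server with these settings."""
    return [offset + n * increment for n in range(count)]


# A uses offset 1, B uses offset 2, both use increment 2.
a_ids = auto_increment_values(offset=1, increment=2, count=5)
b_ids = auto_increment_values(offset=2, increment=2, count=5)
print(a_ids)                    # [1, 3, 5, 7, 9]
print(b_ids)                    # [2, 4, 6, 8, 10]
print(set(a_ids) & set(b_ids))  # set()
```

With the same offset on both servers the two lists would be identical, which is why the offsets must differ.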
Grant privileges [grant at least the FILE, SELECT and REPLICATION SLAVE privileges]
A: allow B to replicate from A using the repl2 account
mysql> grant file, select, replication slave on *.* to 'repl2'@'B_IP' identified by 'PASSWD';
mysql> flush privileges;
Check the grant:
mysql> select * from mysql.user where user='repl2' and host='B_IP'\G
*************************** 1. row ***************************
Host: 211.100.97.250
User: repl2
Password: *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9
Select_priv: Y
Insert_priv: Y
Update_priv: Y
Delete_priv: Y
Create_priv: Y
Drop_priv: Y
Reload_priv: Y
Shutdown_priv: Y
Process_priv: Y
File_priv: Y
Grant_priv: Y
References_priv: Y
Index_priv: Y
Alter_priv: Y
Show_db_priv: Y
Super_priv: Y
Create_tmp_table_priv: Y
Lock_tables_priv: Y
Execute_priv: Y
Repl_slave_priv: Y
Repl_client_priv: Y
Create_view_priv: Y
Show_view_priv: Y
Create_routine_priv: Y
Alter_routine_priv: Y
Create_user_priv: Y
Event_priv: Y
Trigger_priv: Y
ssl_type:
ssl_cipher:
x509_issuer:
x509_subject:
max_questions: 0
max_updates: 0
max_connections: 0
max_user_connections: 0
1 row in set (0.00 sec)
B: allow A to replicate from B using the repl1 account
mysql> grant file, select, replication slave on *.* to 'repl1'@'A_IP' identified by 'PASSWD';
mysql> flush privileges;
mysql> select * from mysql.user where user='repl1' and host='A_IP'\G
*************************** 1. row ***************************
Host: 211.100.97.246
User: repl1
Password: *6BB4837EB74329105EE4568DDA7DC67ED2CA2AD9
Select_priv: Y
Insert_priv: Y
Update_priv: Y
Delete_priv: Y
Create_priv: Y
Drop_priv: Y
Reload_priv: Y
Shutdown_priv: Y
Process_priv: Y
File_priv: Y
Grant_priv: N
References_priv: Y
Index_priv: Y
Alter_priv: Y
Show_db_priv: Y
Super_priv: Y
Create_tmp_table_priv: Y
Lock_tables_priv: Y
Execute_priv: Y
Repl_slave_priv: Y
Repl_client_priv: Y
Create_view_priv: Y
Show_view_priv: Y
Create_routine_priv: Y
Alter_routine_priv: Y
Create_user_priv: Y
ssl_type:
ssl_cipher:
x509_issuer:
x509_subject:
max_questions: 0
max_updates: 0
max_connections: 0
max_user_connections: 0
1 row in set (0.00 sec)
Test the grants afterwards
A: /usr/local/mysql/bin/mysql -h'B_IP' -urepl1 -p
B: /usr/local/mysql/bin/mysql -h'A_IP' -urepl2 -p
Restart mysql on both machines
killall mysqld
ps aux |grep mysql
/usr/local/mysql/bin/mysqld_safe &
ps aux |grep mysql
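Enter the mysql shell
/usr/local/mysql/bin/mysql -uroot -p
A:
Lock the tables on the server (do not terminate the mysql process while the tables are locked, or the procedure will fail)
mysql> flush tables with read lock;
Query OK, 0 rows affected (0.01 sec)
----------------
Check the master status of server A (note the binary log file name and position)
mysql> show master status\G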
*************************** 1. row ***************************
File: mysql-bin.000005
Position: 106
Binlog_Do_DB: test
Binlog_Ignore_DB: mysql
1 row in set (0.00 sec)
----------------
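Configure server A as a slave of B
mysql> change master to
-> master_host='211.100.97.250',
-> master_user='repl1',
-> master_password='123456',
-> master_log_file='mysql-bin.000014',
-> master_log_pos=98;
Query OK, 0 rows affected (0.01 sec)
Notes:
master_host: host B (250) is the master of A (246)
master_user: the account (repl1) that A (246) uses to connect to the master for replication; using the same replication user name and password on both hosts is recommended
master_password: the password of the replication user repl1
master_log_file: the name of the binary log file on the master
master_log_pos: the position within that log file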
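Since the file name and position have to be copied by hand from show master status, a small helper can assemble the statement and avoid typing slips such as a dropped comma. This is a hedged sketch: change_master_sql is a hypothetical helper name, and the quoting is deliberately minimal and meant only for illustration.

```python
def change_master_sql(host, user, password, log_file, log_pos):
    """Build a CHANGE MASTER TO statement from SHOW MASTER STATUS values."""
    def quote(value):
        # Minimal quoting for illustration: escape backslashes and single quotes.
        return "'" + str(value).replace("\\", "\\\\").replace("'", "\\'") + "'"

    clauses = [
        f"master_host={quote(host)}",
        f"master_user={quote(user)}",
        f"master_password={quote(password)}",
        f"master_log_file={quote(log_file)}",
        f"master_log_pos={int(log_pos)}",  # the position is a number, not a string
    ]
    # Joining the clauses guarantees exactly one comma between each pair.
    return "change master to\n  " + ",\n  ".join(clauses) + ";"


print(change_master_sql("211.100.97.250", "repl1", "123456",
                        "mysql-bin.000014", 98))
```

The generated text can be pasted into the mysql shell after stop slave.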
----------------
mysql> stop slave;
mysql> change master to master_host='B_IP', master_user='repl1', master_password='123456', master_log_file='mysql-bin.000001', master_log_pos=106;
Then start the slave
mysql> start slave;
After starting it, check the slave status
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 211.100.97.250
Master_User: repl1
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 106
Relay_Log_File: XKWB5510-relay-bin.000002
Relay_Log_Pos: 251
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB: test
Replicate_Ignore_DB: mysql
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 106
Relay_Log_Space: 409
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
1 row in set (0.00 sec)
Check the related threads
mysql> show processlist\G
*************************** 1. row ***************************
Id: 4
User: root
Host: localhost
db: NULL
Command: Query
Time: 0
State: NULL
Info: show processlist
*************************** 2. row ***************************
Id: 18
User: system user
Host:
db: NULL
Command: Connect
Time: 100
State: Waiting for master to send event
Info: NULL
*************************** 3. row ***************************
Id: 19
User: system user
Host:
db: NULL
Command: Connect
Time: 100
State: Has read all relay log; waiting for the slave I/O thread to update it
Info: NULL
*************************** 4. row ***************************
Id: 21
User: repl2
Host: 211.100.97.250:34536
db: NULL
Command: Binlog Dump
Time: 19
State: Has sent all binlog to slave; waiting for binlog to be updated
Info: NULL
4 rows in set (0.00 sec)
----------------------
Synchronize the base data of the two databases
----------------
Unlock the server
mysql> unlock tables;
----------------
mysql> use test;
mysql> show tables;
Empty set (0.00 sec)
----------------
mysql> create table t11_replicas
-> (id int not null auto_increment primary key,
-> str varchar(255) not null) engine myisam;
Query OK, 0 rows affected (0.00 sec)
mysql> insert into t11_replicas(str) values
-> ('This is a master to master test table');
Query OK, 1 row affected (0.00 sec)
----------------
mysql> show tables;
----------------
mysql> select * from t11_replicas;
------------------------------------------------------------------------------------------------
B:
mysql> show master status\G
*************************** 1. row ***************************
File: mysql-bin.000014
Position: 98
Binlog_Do_DB: test
Binlog_Ignore_DB: mysql
1 row in set (0.00 sec)
----------------
mysql> stop slave;
mysql> change master to master_host='A_IP', master_user='repl2', master_password='123456', master_log_file='mysql-bin.000005',master_log_pos=106;
mysql> start slave;
----------------
mysql> show processlist\G
*************************** 1. row ***************************
Id: 3
User: root
Host: localhost
db: NULL
Command: Query
Time: 0
State: NULL
Info: show processlist
*************************** 2. row ***************************
Id: 15
User: repl1
Host: 211.100.97.246:51840
db: NULL
Command: Binlog Dump
Time: 101
State: Has sent all binlog to slave; waiting for binlog to be updated
Info: NULL
*************************** 3. row ***************************
Id: 16
User: system user
Host:
db: NULL
Command: Connect
Time: 20
State: Waiting for master to send event
Info: NULL
*************************** 4. row ***************************
Id: 17
User: system user
Host:
db: NULL
Command: Connect
Time: 20
State: Has read all relay log; waiting for the slave I/O thread to update it
Info: NULL
4 rows in set (0.00 sec)
----------------
mysql> show slave status\G
----------------
mysql> use test;
Database changed
mysql> unlock tables;
Query OK, 0 rows affected (0.00 sec)
----------------
Reset the log
mysql> reset master;
mysql> show master status\G
*************************** 1. row ***************************
File: mysql-bin.000001
Position: 106
Binlog_Do_DB: test
Binlog_Ignore_DB: mysql
1 row in set (0.00 sec)
[root@XKWB5510 ~]# ls -l /var/mysql/database/data/
total 0
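Check whether there are any error files in the data directory:
[root@XKWB5510 data]# ls /var/mysql/database/data/
After re-running the change master to command and starting the slave again, check the slave status: the I/O thread is now running
Once both threads are running, set up monitoring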
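One possible way to monitor is to parse the vertical output of show slave status and check the two thread flags and the lag. The following Python sketch assumes the text has already been captured (for example from the mysql command-line client); the function names and the max_lag parameter are made up for this example.

```python
def parse_vertical_output(text):
    """Parse the vertical-format output of show slave status into a dict."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the "*** 1. row ***" separator, and the row count.
        if not line or line.startswith("*") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def replication_healthy(status, max_lag=0):
    """True when both slave threads run and the lag is at most max_lag seconds."""
    lag = status.get("Seconds_Behind_Master", "NULL")
    return (
        status.get("Slave_IO_Running") == "Yes"
        and status.get("Slave_SQL_Running") == "Yes"
        and lag.isdigit()          # NULL means the lag is unknown
        and int(lag) <= max_lag
    )


sample = """*************************** 1. row ***************************
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 0
"""
print(replication_healthy(parse_vertical_output(sample)))  # True
```

A cron job could run such a check periodically and send an alert when it returns False.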
-------------------------------------------------------------------------------------------------------------
Errors:
1)
Caused by change master:
Last_IO_Error: error connecting to master 'repl1@A_IP:3306' - retry-time: 60 retries
2)
Stopping the slave threads while the tables are still locked:
mysql> stop slave;
ERROR 1192 (HY000): Can't execute the given command because you have active locked tables or an active transaction
3)
Syntax error in change master: a comma is missing
mysql> change master to
-> master_host='211.100.97.250'
-> master_user='repl2',
-> master_password='123456',
-> master_log_file='mysql-bin.000002',
-> master_log_pos=106;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'master_user='repl2',
master_password='123456',
master_log_file='mysql-bin.000002' at line 3
4)
Running change master without stopping the slave threads first
mysql> change master to master_host='211.100.97.246', master_user='repl1', master_password='123456', master_log_file='mysql-bin.000001',master_log_pos=106;
ERROR 1198 (HY000): This operation cannot be performed with a running slave; run STOP SLAVE first
5)
A and B have the same server-id:
Last_IO_Error: Fatal error: The slave I/O thread stops because master and slave have equal MySQL server ids;
these ids must be different for replication to work (or the --replicate-same-server-id option must be used on
slave but this does not always make sense; please check the manual before using it).
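Check the server-id
mysql> show variables like 'server_id';
Change the server-id manually
mysql> set global server_id=2; # use the same value as set in my.cnf
mysql> start slave;
6) After change master, the slave status shows Slave_IO_Running as No
Note that it is best to restart the mysql process after completing the steps above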
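Error 5) can be caught before starting the servers by comparing the two configuration files. The Python sketch below uses the standard configparser module and assumes each file has a [mysqld] section; the function names and the list of options checked are choices made for this example.

```python
import configparser


def load_mysqld_options(text):
    """Parse the [mysqld] section of a my.cnf-style string into a dict."""
    parser = configparser.ConfigParser(
        allow_no_value=True,  # flags such as log-slave-updates have no value
        strict=False,         # options such as binlog-do-db may be repeated
        interpolation=None,   # treat % characters literally
    )
    parser.read_string(text)
    return dict(parser["mysqld"])


def replication_conflicts(cnf_a, cnf_b):
    """Return the options that must differ between two masters but do not."""
    a, b = load_mysqld_options(cnf_a), load_mysqld_options(cnf_b)
    must_differ = ["server-id", "auto_increment_offset"]
    # An option missing on both sides also counts, since both use the default.
    return [name for name in must_differ if a.get(name) == b.get(name)]


cnf_a = "[mysqld]\nserver-id = 1\nlog-slave-updates\nauto_increment_offset=1\n"
cnf_b = "[mysqld]\nserver-id = 2\nlog-slave-updates\nauto_increment_offset=2\n"
print(replication_conflicts(cnf_a, cnf_b))  # []
```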
---------------------------------------
Checking data replication
A: insert data on A
mysql> create table aniya (id int not null auto_increment primary key, str varchar(255) not null);
mysql> insert into aniya(str) values
-> ('This is a master to master test table');
mysql> select * from aniya;
+----+---------------------------------------+
| id | str |
+----+---------------------------------------+
| 1 | This is a master to master test table |
+----+---------------------------------------+
1 row in set (0.00 sec)
Check the relay log on B:
[root@XKWB5705 var]# ls -lrth XKWB5705-relay-bin.000003
-rw-rw---- 1 mysql mysql 576 Sep 26 12:29 XKWB5705-relay-bin.000003
[root@XKWB5705 var]# more XKWB5705-relay-bin.000003
.in.N
(id int not null auto_increment primary key,
str varchar(255) not null)3
('This is a master to master test table')
-----
Replication test with B as master and A as slave
Create the table lian on B and insert data
mysql> create table lian (a int,b char(10));
Query OK, 0 rows affected (0.01 sec)
mysql> insert into lian (a,b)values(22,hahah);
ERROR 1054 (42S22): Unknown column 'hahah' in 'field list'
mysql> insert into lian (a,b)values(22,'hahah');
Query OK, 1 row affected (0.00 sec)
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| lian |
+----------------+
1 row in set (0.00 sec)
mysql> select * from lian;
+------+-------+
| a | b |
+------+-------+
| 22 | hahah |
+------+-------+
1 row in set (0.00 sec)
Check B's master log to confirm that the operations above succeeded:
cat mysql-bin.000002
.?Nh?@stdtestcreate table lian (a int,b char(10))??Nl>@stdtestinsert into lian (a,b)values(22,'hahah')
Now check the relay log on slave A: the log has been replicated
[root@XKWB5510 var]# cat XKWB5510-relay-bin.000003
.?Nh?@stdtestcreate table lian (a int,b char(10))??Nl>@stdtestinsert into lian (a,b)values(22,'hahah')
Then check on slave A whether the table lian exists in the database:
mysql> use test;
Database changed
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| aniya |
| lian |
+----------------+
2 rows in set (0.00 sec)
This shows that replication from B (master) to A (slave) works
---------------------------------------------------------------------------
Replication test with A as master and B as slave
Create the table From246 on A and insert data
mysql> use test;
Database changed
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| A246 |
| aniya |
| lian |
+----------------+
3 rows in set (0.00 sec)
mysql> create table From246(Name varchar(255),Sex varchar(255),Age int(10));
Query OK, 0 rows affected (0.00 sec)
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| A246 |
| From246 |
| aniya |
| lian |
+----------------+
4 rows in set (0.00 sec)
mysql> insert into From246 (Name,Sex,Age)values('Zhaoyj','Girl',24);
Query OK, 1 row affected (0.00 sec)
mysql> select * from From246;
+--------+------+------+
| Name | Sex | Age |
+--------+------+------+
| Zhaoyj | Girl | 24 |
+--------+------+------+
1 row in set (0.00 sec)
Check A's master log to confirm that the operations above succeeded
[root@XKWB5510 var]# tail -1 mysql-bin.000002
testcreate table From246(Name varchar(255),Sex varchar(255),Age int(10))?N?R@stdtestinsert into From246 (Name,Sex,Age)values('Zhaoyj','Girl',24)
Inspect A's master log with mysqlbinlog
[root@XKWB5510 var]# /usr/local/mysql/bin/mysqlbinlog mysql-bin.000003 |tail -15
/*!*/;
# at 702
#110926 14:01:51 server id 1 end_log_pos 838 Query thread_id=5 exec_time=0 error_code=0
SET TIMESTAMP=1317016911/*!*/;
create table From246(Name varchar(255),Sex varchar(255),Age int(10))
/*!*/;
# at 838
#110926 14:02:05 server id 1 end_log_pos 966 Query thread_id=5 exec_time=0 error_code=0
SET TIMESTAMP=1317016925/*!*/;
insert into From246 (Name,Sex,Age)values('Zhaoyj','Girl',24)
/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
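The positions in mysqlbinlog text output like the listing above can also be extracted with a short script, which makes it easier to compare the master's log with the slave's relay log. This Python sketch assumes the "# at N" and "server id ... end_log_pos ..." line layout shown in the listing; binlog_events is a name made up for this example.

```python
import re

# Matches the event header line printed by mysqlbinlog, as seen in the listing.
EVENT_RE = re.compile(r"server id (\d+)\s+end_log_pos (\d+)\s+(\w+)")


def binlog_events(text):
    """Return (start_pos, server_id, end_log_pos, event_type) tuples."""
    start = None
    events = []
    for line in text.splitlines():
        at = re.match(r"# at (\d+)", line)
        if at:
            # Remember where the next event starts.
            start = int(at.group(1))
            continue
        header = EVENT_RE.search(line)
        if header and line.startswith("#"):
            events.append((start, int(header.group(1)),
                           int(header.group(2)), header.group(3)))
    return events
```

The server id and end_log_pos values come from the originating master, so they should be the same in A's bin-log and in B's relay log for the same statement.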
Check B's relay log: the log was replicated successfully
testcreate table From246(Name varchar(255),Sex varchar(255),Age int(10))?N?R@stdtestinsert into From246 (Name,Sex,Age)values('Zhaoyj','Girl',24)[root@XKWB5705 var]
Inspect B's relay log with mysqlbinlog
[root@XKWB5705 var]# /usr/local/mysql/bin/mysqlbinlog XKWB5705-relay-bin.000005|tail -13
/usr/local/mysql/bin/mysqlbinlog: Character set '#28' is not a compiled character set and is not specified in the '/usr/local/mysql/share/mysql/charsets/Index.xml' file
#110926 14:01:51 server id 1 end_log_pos 838 Query thread_id=5 exec_time=0 error_code=0
SET TIMESTAMP=1317016911/*!*/;
create table From246(Name varchar(255),Sex varchar(255),Age int(10))
/*!*/;
# at 853
#110926 14:02:05 server id 1 end_log_pos 966 Query thread_id=5 exec_time=0 error_code=0
SET TIMESTAMP=1317016925/*!*/;
insert into From246 (Name,Sex,Age)values('Zhaoyj','Girl',24)
/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
However, the data was not inserted into the database
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| lian |
+----------------+
1 row in set (0.00 sec)
When I drop the tables on A, B's relay log also receives the events
[root@XKWB5705 var]# tail -4 XKWB5705-relay-bin.000005
??NS?@stdtestdrop table A246??NT@stdtestdrop table aniya??NSd@stdtestdrop table lian??NV?@stdtestdrop table From246
------------------------------------------------------------------------------
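Troubleshooting:
First, on the master, run
show processlist; to see whether there are too many Sleep threads. Everything looks normal.
show master status; also looks normal.
Then check on the slave, which also looks normal
show slave status;
One problem shows up:
when manually loading data from A into B
mysql> load table From246 from master;
ERROR 1115 (42000): Unknown character set: 'gbk'
Suspicion: is a character set problem causing replication from A to B to fail?
The show character set command shows that
A has the gbk character set while B does not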
mysql> show character set;
+----------+-----------------------------+---------------------+--------+
| Charset | Description | Default collation | Maxlen |
+----------+-----------------------------+---------------------+--------+
| dec8 | DEC West European | dec8_swedish_ci | 1 |
| cp850 | DOS West European | cp850_general_ci | 1 |
| hp8 | HP West European | hp8_english_ci | 1 |
| koi8r | KOI8-R Relcom Russian | koi8r_general_ci | 1 |
| latin1 | cp1252 West European | latin1_swedish_ci | 1 |
| latin2 | ISO 8859-2 Central European | latin2_general_ci | 1 |
| swe7 | 7bit Swedish | swe7_swedish_ci | 1 |
| ascii | US ASCII | ascii_general_ci | 1 |
| hebrew | ISO 8859-8 Hebrew | hebrew_general_ci | 1 |
| koi8u | KOI8-U Ukrainian | koi8u_general_ci | 1 |
| greek | ISO 8859-7 Greek | greek_general_ci | 1 |
| cp1250 | Windows Central European | cp1250_general_ci | 1 |
| gbk | GBK Simplified Chinese | gbk_chinese_ci | 2 |
| latin5 | ISO 8859-9 Turkish | latin5_turkish_ci | 1 |
| armscii8 | ARMSCII-8 Armenian | armscii8_general_ci | 1 |
| utf8 | UTF-8 Unicode | utf8_general_ci | 3 |
| cp866 | DOS Russian | cp866_general_ci | 1 |
| keybcs2 | DOS Kamenicky Czech-Slovak | keybcs2_general_ci | 1 |
| macce | Mac Central European | macce_general_ci | 1 |
| macroman | Mac West European | macroman_general_ci | 1 |
| cp852 | DOS Central European | cp852_general_ci | 1 |
| latin7 | ISO 8859-13 Baltic | latin7_general_ci | 1 |
| cp1251 | Windows Cyrillic | cp1251_general_ci | 1 |
| cp1256 | Windows Arabic | cp1256_general_ci | 1 |
| cp1257 | Windows Baltic | cp1257_general_ci | 1 |
| binary | Binary pseudo charset | binary | 1 |
| geostd8 | GEOSTD8 Georgian | geostd8_general_ci | 1 |
+----------+-----------------------------+---------------------+--------+
27 rows in set (0.00 sec)
So the character sets should be made the same when starting mysql
A :[root@XKWB5510 var]# /usr/local/mysql/bin/mysqld_safe --default-character-set=latin1 &
B :[root@XKWB5705 var]# /usr/local/mysql/bin/mysqld_safe --default-character-set=latin1 &
Load the data from A on B:
mysql> show tables;
Empty set (0.00 sec)
mysql> load table From246 from master;
Query OK, 0 rows affected (0.01 sec)
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| From246 |
+----------------+
1 row in set (0.00 sec)
The character set problem is now solved
-----------------------------------
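Now manually start the thread that applies the log to the database: start slave sql_thread
and the thread that writes the master's log to the local relay log: start slave io_thread
Replication of the data still fails, so the threads are not the problem
If Seconds_Behind_Master is NULL
Fix: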
stop slave;
set global sql_slave_skip_counter =1 ;
start slave;
The slave will then catch up with the master. Watch Seconds_Behind_Master: once it reaches 0, the slave is in sync.
-----------------------------------
Check whether master.info on slave B matches the status on master A
master A:
mysql> show master status\G
*************************** 1. row ***************************
File: mysql-bin.000004
Position: 808
Binlog_Do_DB: test
Binlog_Ignore_DB: mysql
1 row in set (0.00 sec)
slave B:
[root@XKWB5705 var]# cat master.info
15
mysql-bin.000004
808
211.100.97.246
repl2
123456
3306
60
0
The output above shows that they are in sync
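The same comparison can be scripted. The Python sketch below assumes the master.info field order seen in the listing above (line count, log file, position, host, user, password, port); parse_master_info and in_sync are names made up for this example.

```python
def parse_master_info(text):
    """Read the leading fields of a master.info file laid out as shown above."""
    lines = text.splitlines()
    return {
        "log_file": lines[1].strip(),  # master bin-log file being read
        "log_pos": int(lines[2]),      # position read so far in that file
        "host": lines[3].strip(),
        "user": lines[4].strip(),
        "port": int(lines[6]),         # lines[5] is the password, skipped here
    }


def in_sync(master_file, master_pos, info):
    """True when the slave has read up to the master's current position."""
    return (info["log_file"], info["log_pos"]) == (master_file, int(master_pos))


info = parse_master_info(
    "15\nmysql-bin.000004\n808\n211.100.97.246\nrepl2\n123456\n3306\n60\n0\n"
)
print(in_sync("mysql-bin.000004", 808, info))  # True
```

Note that master.info reflects what the I/O thread has read; whether the SQL thread has applied it is shown by Exec_Master_Log_Pos in show slave status.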
--------------------------------------------
flush master
flush slave