DB2 and MySQL LOAD: How They Work and a Comparison Test

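How DB2 LOAD works:
(1) Load phase

      The load phase parses the source data into the physical data page format and writes it directly into the data pages. Where necessary, it also collects index keys and table statistics.

(2) Build phase

      The table's indexes are built from the index keys collected during the load phase.
(3) Delete phase

      In this phase, rows that violate a primary key or unique constraint are deleted (a primary key constraint implies uniqueness). If an exception table is defined, the deleted rows are inserted into it. After the load completes, you can query the exception table to see which rows violated the uniqueness constraint.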
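
The three phases can be pictured with a small Python simulation. This is only a toy model of the behaviour described above, not how DB2 is implemented. The function name `simulate_load`, the in-memory lists, and the rule "keep the first row for each key" are all assumptions made for illustration.

```python
def simulate_load(lines):
    """Toy model of the LOAD, BUILD and DELETE phases for a table keyed on column 1."""
    # LOAD phase: parse every input record into a row.
    rows = []
    for line in lines:
        key, name = line.split(",", 1)
        rows.append((int(key), name))

    # BUILD phase: build the unique index from the collected keys,
    # remembering every position at which each key occurs.
    index = {}
    for pos, (key, _) in enumerate(rows):
        index.setdefault(key, []).append(pos)

    # DELETE phase: keep the first row per key (an assumption of this sketch)
    # and move later duplicates to the exception table.
    duplicates = {pos for positions in index.values() for pos in positions[1:]}
    table = [row for pos, row in enumerate(rows) if pos not in duplicates]
    exceptions = [row for pos, row in enumerate(rows) if pos in duplicates]
    return table, exceptions


table, exceptions = simulate_load(["1,liys", "2,zhangs", "3,wangw", "4,lius", "2,error"])
print(table)       # [(1, 'liys'), (2, 'zhangs'), (3, 'wangw'), (4, 'lius')]
print(exceptions)  # [(2, 'error')]
```

All five rows pass through the load phase. The duplicate is removed only in the final phase.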

Test:
[db2inst1@t3-dtpoc-dtpoc-web04 ~]$ db2 "create table employee(id int not null primary key,name varchar(10))"
DB20000I  The SQL command completed successfully.
[db2inst1@t3-dtpoc-dtpoc-web04 ~]$ db2 "create table employee_exception(id int not null,name varchar(10))"                    
DB20000I  The SQL command completed successfully.

vi employee.del
1,liys
2,zhangs
3,wangw
4,lius
2,error

[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "load from employee.del of del insert into employee for exception employee_exception nonrecoverable"
SQL3109N  The utility is beginning to load data from file 
"/home/db2inst1/liys/employee.del".

SQL3500W  The utility is beginning the "LOAD" phase at time "08/30/2023 
14:27:58.655991".

SQL3519W  Begin Load Consistency Point. Input record count = "0".

SQL3520W  Load Consistency Point was successful.

SQL3110N  The utility has completed processing.  "5" rows were read from the 
input file.

SQL3519W  Begin Load Consistency Point. Input record count = "5".

SQL3520W  Load Consistency Point was successful.

SQL3515W  The utility has finished the "LOAD" phase at time "08/30/2023 
14:27:58.702249".

SQL3500W  The utility is beginning the "BUILD" phase at time "08/30/2023 
14:27:58.706067".

SQL3213I  The indexing mode is "REBUILD".

SQL3515W  The utility has finished the "BUILD" phase at time "08/30/2023 
14:27:58.764588".

SQL3500W  The utility is beginning the "DELETE" phase at time "08/30/2023 
14:27:58.790554".

SQL3509W  The utility has deleted "1" rows from the table.

SQL3515W  The utility has finished the "DELETE" phase at time "08/30/2023 
14:27:58.798497".


Number of rows read         = 5
Number of rows skipped      = 0
Number of rows loaded       = 5
Number of rows rejected     = 0
Number of rows deleted      = 1
Number of rows committed    = 5

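As shown, all 5 rows were loaded in the LOAD phase, and the row with the duplicate primary key was then deleted in the DELETE phase and moved to the exception table.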

[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "select * from employee_exception"

ID          NAME      
----------- ----------
          2 error 

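Note:
Oracle: 'C' and 'C ' (with a trailing space) are treated as two different values.
DB2: 'C' and 'C ' compare as equal, so inserting 'C ' fails with SQL0803N. As a result, data exported from Oracle and then loaded into DB2 can produce duplicate primary keys. This is also a common reason why replication software such as CDC reports primary key conflicts. The fix is to configure conflict resolution so that the source wins and the duplicate row is discarded on the target.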
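
Before loading an Oracle export into DB2, it can help to check which key values would collide once trailing spaces stop being significant. The following is a minimal Python sketch of such a check. The function name `find_trailing_space_collisions` is hypothetical, and the sketch assumes the keys have already been read into a list of strings.

```python
from collections import defaultdict


def find_trailing_space_collisions(keys):
    """Group keys that differ only in trailing spaces."""
    groups = defaultdict(list)
    for key in keys:
        # Strip trailing spaces only; leading spaces stay significant.
        groups[key.rstrip(" ")].append(key)
    return [group for group in groups.values() if len(group) > 1]


print(find_trailing_space_collisions(["C", "C ", " C", "D"]))
# [['C', 'C ']]
```

Each returned group is a set of keys that are distinct in Oracle but would be treated as the same primary key value in DB2.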

[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "create table employee1(id varchar(10) not null primary key,name varchar(10))"
DB20000I  The SQL command completed successfully.
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "insert into employee1 values('liys  ','llljjj  ')"
DB20000I  The SQL command completed successfully.
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "select id||'AAA', name||'AAA' from employee1"

1             2            
------------- -------------
liys  AAA     llljjj  AAA  

  1 record(s) selected.

[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "insert into employee1 values('liys','llljjj  ')"  
DB21034E  The command was processed as an SQL statement because it was not a 
valid Command Line Processor command.  During SQL processing it returned:
SQL0803N  One or more values in the INSERT statement, UPDATE statement, or 
foreign key update caused by a DELETE statement are not valid because the 
primary key, unique constraint or unique index identified by "1" constrains 
table "DB2INST1.EMPLOYEE1" from having duplicate values for the index key.  
SQLSTATE=23505
 
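 But what about leading spaces? They do not cause a primary key conflict: DB2 also treats values that differ only in leading spaces as distinct.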
 [db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "insert into employee1 values('  liys','llljjj  ')"   
DB20000I  The SQL command completed successfully.
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "select id||'AAA', name||'AAA' from employee1"

1             2            
------------- -------------
liys  AAA     llljjj  AAA  
  liysAAA     llljjj  AAA  

  2 record(s) selected.
  
  
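 What if the input string is longer than varchar(10)? Is it truncated automatically, or does the insert fail? In DB2, INSERT fails with an error, but LOAD truncates the value and then loads it.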
 
 
Number of rows read         = 2
Number of rows skipped      = 0
Number of rows loaded       = 2
Number of rows rejected     = 0
Number of rows deleted      = 0
Number of rows committed    = 2

[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "select *  from employee"

ID          NAME      
----------- ----------
          1 fj        
          2 waizhonggu

  2 record(s) selected.
  
  [db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "insert into employee1 values (3,'slafdjlakjfsadhflhaslfhalsf')"
DB21034E  The command was processed as an SQL statement because it was not a 
valid Command Line Processor command.  During SQL processing it returned:
SQL0433N  Value "slafdjlakjfsadhflhaslfhalsf" is too long.  SQLSTATE=22001
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "select *  from employee"                                       

ID          NAME      
----------- ----------
          1 fj        
          2 waizhonggu

  2 record(s) selected.
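
Because LOAD truncates silently where INSERT would fail, it may be worth scanning the input file for over-long values before loading. The sketch below is one way to do that in Python. The function name `find_overlong_fields` is hypothetical, and it assumes a comma-delimited file whose second column maps to the varchar(10) column. It counts characters rather than bytes, which is only equivalent for single-byte data.

```python
def find_overlong_fields(lines, max_len=10, delimiter=","):
    """Return (line_number, value) for second-column values longer than max_len."""
    overlong = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(delimiter)
        # Only the second column is checked; it maps to the varchar(10) column.
        if len(fields) > 1 and len(fields[1]) > max_len:
            overlong.append((line_no, fields[1]))
    return overlong


sample = ["3,fhjl", "6,woaizhogngguoodddafafsaa"]
print(find_overlong_fields(sample))
# [(2, 'woaizhogngguoodddafafsaa')]
```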


  
MySQL test:
load data local infile "/home/mysql/liys/employee1.del" into table user lines terminated by '\n';

mysql> load data local infile "/home/mysql/liys/employee1.del" into table employee lines terminated by '\n';
Query OK, 4 rows affected, 11 warnings (0.00 sec)
Records: 5  Deleted: 0  Skipped: 1  Warnings: 11

mysql> select * from employee;
+----+------+
| id | name |
+----+------+
|  1 | NULL |
|  2 | NULL |
|  3 | NULL |
|  4 | NULL |
+----+------+
4 rows in set (0.00 sec)

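The fifth record was skipped (presumably the row with the duplicate id=2, as in the DB2 test file), but every name column is NULL. The default field terminator for LOAD DATA is a tab, so each whole line was read as the id field. Adding FIELDS TERMINATED BY ',' fixes this.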
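
The effect of the field terminator can be seen with a tiny Python sketch. It mimics only the splitting step, not MySQL's type conversion or warning handling, and the helper name `split_fields` is made up for illustration.

```python
def split_fields(line, terminator="\t"):
    """Split one input line into fields on the given terminator."""
    return line.rstrip("\n").split(terminator)


print(split_fields("1,liys"))       # ['1,liys']
print(split_fields("1,liys", ","))  # ['1', 'liys']
```

With the tab default, the line yields a single field, so nothing is left over for the name column.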

mysql> load data local infile "/home/mysql/liys/employee1.del" into table employee FIELDS TERMINATED BY ',' lines terminated by '\n';
Query OK, 4 rows affected, 1 warning (0.01 sec)
Records: 5  Deleted: 0  Skipped: 1  Warnings: 1

mysql> select * from employee;
+----+--------+
| id | name   |
+----+--------+
|  1 | liy    |
|  2 | zhangs |
|  3 | wangw  |
|  4 | lius   |
+----+--------+
4 rows in set (0.00 sec)

mysql> 

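Does MySQL treat 'C' and 'C ' as two different values? In this test, MySQL behaved exactly like DB2. (This depends on the collation: PAD SPACE collations ignore trailing spaces in comparisons, while NO PAD collations such as utf8mb4_0900_ai_ci, the MySQL 8.0 default, do not.)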

mysql> create table employee1(id varchar(10) not null primary key,name varchar(10));
Query OK, 0 rows affected (0.01 sec)

mysql> insert into employee1 values('liys  ','llljjj  ');
Query OK, 1 row affected (0.01 sec)
mysql> select concat(id,'AAA'),concat(name,'AAA') from employee1;                                        
+------------------+--------------------+
| concat(id,'AAA') | concat(name,'AAA') |
+------------------+--------------------+
| liys  AAA        | llljjj  AAA        |
+------------------+--------------------+
1 row in set (0.00 sec)

mysql> insert into employee1 values('liys','llljjj  ');
ERROR 1062 (23000): Duplicate entry 'liys' for key 'PRIMARY'

      
mysql> insert into employee1 values('liys1','llljjj  ');
Query OK, 1 row affected (0.01 sec)

mysql> select concat(id,'AAA'),concat(name,'AAA') from employee1;
+------------------+--------------------+
| concat(id,'AAA') | concat(name,'AAA') |
+------------------+--------------------+
| liys  AAA        | llljjj  AAA        |
| liys1AAA         | llljjj  AAA        |
+------------------+--------------------+
2 rows in set (0.00 sec)

mysql> insert into employee1 values('  liys1','llljjj  ');
Query OK, 1 row affected (0.00 sec)

mysql> select concat(id,'AAA'),concat(name,'AAA') from employee1;
+------------------+--------------------+
| concat(id,'AAA') | concat(name,'AAA') |
+------------------+--------------------+
|   liys1AAA       | llljjj  AAA        |
| liys  AAA        | llljjj  AAA        |
| liys1AAA         | llljjj  AAA        |
+------------------+--------------------+
3 rows in set (0.00 sec)

mysql> insert into employee1 values('  liys','llljjj  ');
Query OK, 1 row affected (0.01 sec)

mysql> select concat(id,'AAA'),concat(name,'AAA') from employee1;
+------------------+--------------------+
| concat(id,'AAA') | concat(name,'AAA') |
+------------------+--------------------+
|   liysAAA        | llljjj  AAA        |
|   liys1AAA       | llljjj  AAA        |
| liys  AAA        | llljjj  AAA        |
| liys1AAA         | llljjj  AAA        |
+------------------+--------------------+
4 rows in set (0.00 sec)

 
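 What if the input string is longer than varchar(10)? Is it truncated automatically, or does the insert fail? The MySQL result was the same as DB2: INSERT fails with an error, but LOAD truncates the value and then loads it.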
 
vi employee1.del
3,fhjl
6,woaizhogngguoodddafafsaa

mysql> load data local infile "/home/mysql/liys/employee1.del" into table employee FIELDS TERMINATED BY ',' lines terminated by '\n';
Query OK, 2 rows affected, 1 warning (0.00 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 1

mysql> select * from employee;
+----+------------+
| id | name       |
+----+------------+
|  3 | fhjl       |
|  6 | woaizhogng |
+----+------------+
2 rows in set (0.00 sec)


mysql> insert into employee values (234,'afasfhflashflashflaisfasfsafassaf');
ERROR 1406 (22001): Data too long for column 'name' at row 1
mysql> 
