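The DB2 LOAD process:
(1) Load phase
The load phase parses the source data into the physical data page format and writes it directly into data pages. Index keys and table statistics are also collected when needed.
(2) Build phase
The table's indexes are built from the index keys collected during the load phase.
(3) Delete phase
Rows that violate a primary key or unique constraint are deleted in this phase (a primary key constraint includes a unique constraint). If an exception table is defined, the deleted rows are inserted into it, and after the load completes you can query the exception table to see which rows violated the uniqueness constraint.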
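The three phases can be pictured with a small toy model. This is only a sketch of the visible behavior, not how DB2 implements LOAD internally: the function name `simulate_load` is hypothetical, and the rule "keep the first row seen for each key" is an assumption chosen to match the result of the test in this article.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, str]


def simulate_load(rows: List[Row]) -> Tuple[List[Row], List[Row], Dict[str, int]]:
    """Toy model of the LOAD, BUILD and DELETE phases; the key is row[0]."""
    # LOAD phase: every parsed row is appended to the table, duplicates included.
    table = list(rows)

    # BUILD phase: build an index that maps each key to the positions
    # of the rows carrying that key.
    index: Dict[int, List[int]] = {}
    for pos, (key, _name) in enumerate(table):
        index.setdefault(key, []).append(pos)

    # DELETE phase: for each key keep the first row (an assumption) and
    # move the later ones to the exception table.
    doomed = {pos for positions in index.values() for pos in positions[1:]}
    exceptions = [row for pos, row in enumerate(table) if pos in doomed]
    kept = [row for pos, row in enumerate(table) if pos not in doomed]

    counts = {"read": len(rows), "loaded": len(table), "deleted": len(exceptions)}
    return kept, exceptions, counts


rows = [(1, "liys"), (2, "zhangs"), (3, "wangw"), (4, "lius"), (2, "error")]
kept, exceptions, counts = simulate_load(rows)
print(exceptions)  # [(2, 'error')]
print(counts)      # {'read': 5, 'loaded': 5, 'deleted': 1}
```

Note that the duplicate row counts as loaded and is only removed afterwards, which is why a load summary can report 5 rows loaded and 1 row deleted at the same time.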
Test:
[db2inst1@t3-dtpoc-dtpoc-web04 ~]$ db2 "create table employee(id int not null primary key,name varchar(10))"
DB20000I The SQL command completed successfully.
[db2inst1@t3-dtpoc-dtpoc-web04 ~]$ db2 "create table employee_exception(id int not null,name varchar(10))"
DB20000I The SQL command completed successfully.
vi employee.del
1,liys
2,zhangs
3,wangw
4,lius
2,error
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "load from employee.del of del insert into employee for exception employee_exception nonrecoverable"
SQL3109N The utility is beginning to load data from file
"/home/db2inst1/liys/employee.del".
SQL3500W The utility is beginning the "LOAD" phase at time "08/30/2023
14:27:58.655991".
SQL3519W Begin Load Consistency Point. Input record count = "0".
SQL3520W Load Consistency Point was successful.
SQL3110N The utility has completed processing. "5" rows were read from the
input file.
SQL3519W Begin Load Consistency Point. Input record count = "5".
SQL3520W Load Consistency Point was successful.
SQL3515W The utility has finished the "LOAD" phase at time "08/30/2023
14:27:58.702249".
SQL3500W The utility is beginning the "BUILD" phase at time "08/30/2023
14:27:58.706067".
SQL3213I The indexing mode is "REBUILD".
SQL3515W The utility has finished the "BUILD" phase at time "08/30/2023
14:27:58.764588".
SQL3500W The utility is beginning the "DELETE" phase at time "08/30/2023
14:27:58.790554".
SQL3509W The utility has deleted "1" rows from the table.
SQL3515W The utility has finished the "DELETE" phase at time "08/30/2023
14:27:58.798497".
Number of rows read = 5
Number of rows skipped = 0
Number of rows loaded = 5
Number of rows rejected = 0
Number of rows deleted = 1
Number of rows committed = 5
As shown, the row with the duplicate primary key was moved into the exception table.
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "select * from employee_exception"
ID NAME
----------- ----------
2 error
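Worth noting:
Oracle: 'C' and 'C ' are treated as two different strings.
DB2: 'C' and 'C ' are treated as equal, so inserting 'C ' fails with SQL0803N. As a result, data exported from Oracle and loaded into DB2 can produce duplicate primary keys. This is also a common reason why replication software such as CDC reports primary key conflicts. The fix is to configure conflict resolution so that the source wins and the target discards the duplicate rows.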
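Before loading an Oracle export into DB2, it can help to check which keys would collide. The following is a minimal sketch, assuming the key values have already been read into a list of strings; the function name `trailing_blank_collisions` is hypothetical, and the check models only the trailing-space rule described above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def trailing_blank_collisions(keys: Iterable[str]) -> Dict[str, List[str]]:
    """Group keys that compare equal once trailing spaces are ignored."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for key in keys:
        # Strip trailing spaces only; leading spaces stay significant.
        groups[key.rstrip(" ")].append(key)
    # Keep only the groups in which more than one key landed.
    return {k: v for k, v in groups.items() if len(v) > 1}


keys = ["C", "C ", " C", "D"]
print(trailing_blank_collisions(keys))  # {'C': ['C', 'C ']}
```

Each reported group contains keys that are distinct in Oracle but would be treated as one key value on the DB2 side.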
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "create table employee1(id varchar(10) not null primary key,name varchar(10))"
DB20000I The SQL command completed successfully.
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "insert into employee1 values('liys ','llljjj ')"
DB20000I The SQL command completed successfully.
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "select id||'AAA', name||'AAA' from employee1"
1 2
------------- -------------
liys AAA llljjj AAA
1 record(s) selected.
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "insert into employee1 values('liys','llljjj ')"
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0803N One or more values in the INSERT statement, UPDATE statement, or
foreign key update caused by a DELETE statement are not valid because the
primary key, unique constraint or unique index identified by "1" constrains
table "DB2INST1.EMPLOYEE1" from having duplicate values for the index key.
SQLSTATE=23505
What if the space comes first instead? Then there is no primary key conflict: DB2 treats a leading space as significant.
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "insert into employee1 values(' liys','llljjj ')"
DB20000I The SQL command completed successfully.
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "select id||'AAA', name||'AAA' from employee1"
1 2
------------- -------------
liys AAA llljjj AAA
liysAAA llljjj AAA
2 record(s) selected.
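What if the input string is longer than varchar(10)? Is it truncated automatically, or does the insert fail? In DB2, an INSERT fails with an error, but LOAD truncates the value and loads it.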
Number of rows read = 2
Number of rows skipped = 0
Number of rows loaded = 2
Number of rows rejected = 0
Number of rows deleted = 0
Number of rows committed = 2
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "select * from employee"
ID NAME
----------- ----------
1 fj
2 waizhonggu
2 record(s) selected.
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "insert into employee1 values (3,'slafdjlakjfsadhflhaslfhalsf')"
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0433N Value "slafdjlakjfsadhflhaslfhalsf" is too long. SQLSTATE=22001
[db2inst1@t3-dtpoc-dtpoc-web04 liys]$ db2 "select * from employee"
ID NAME
----------- ----------
1 fj
2 waizhonggu
2 record(s) selected.
MySQL test:
load data local infile "/home/mysql/liys/employee1.del" into table user lines terminated by '\n';
mysql> load data local infile "/home/mysql/liys/employee1.del" into table employee lines terminated by '\n';
Query OK, 4 rows affected, 11 warnings (0.00 sec)
Records: 5 Deleted: 0 Skipped: 1 Warnings: 11
mysql> select * from employee;
+----+------+
| id | name |
+----+------+
| 1 | NULL |
| 2 | NULL |
| 3 | NULL |
| 4 | NULL |
+----+------+
4 rows in set (0.00 sec)
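One of the five records was skipped (most likely the row with the duplicate primary key, since LOAD DATA LOCAL handles duplicate keys as if IGNORE were specified), and every name value is NULL. No field terminator was given, so MySQL used its default, a tab, and parsed each whole line into id. Adding FIELDS TERMINATED BY ',' fixes this: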
mysql> load data local infile "/home/mysql/liys/employee1.del" into table employee FIELDS TERMINATED BY ',' lines terminated by '\n';
Query OK, 4 rows affected, 1 warning (0.01 sec)
Records: 5 Deleted: 0 Skipped: 1 Warnings: 1
mysql> select * from employee;
+----+--------+
| id | name |
+----+--------+
| 1 | liy |
| 2 | zhangs |
| 3 | wangw |
| 4 | lius |
+----+--------+
4 rows in set (0.00 sec)
mysql>
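The NULL names in the first attempt follow from how a line is split into fields. Here is a minimal sketch of that splitting step, assuming a plain delimiter split with no quoting or escaping; `split_fields` is a hypothetical helper, not part of MySQL.

```python
from typing import List


def split_fields(line: str, field_terminator: str = "\t") -> List[str]:
    """Split one input line into fields; the default separator is a tab."""
    return line.rstrip("\n").split(field_terminator)


# With the default tab separator, the whole line is a single field,
# so nothing is left over for the name column.
print(split_fields("2,zhangs\n"))       # ['2,zhangs']

# With ',' as the separator, the line yields both id and name.
print(split_fields("2,zhangs\n", ","))  # ['2', 'zhangs']
```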
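Does MySQL treat 'C' and 'C ' as two different strings? In this test MySQL behaved exactly like DB2. (In MySQL this depends on the column collation: PAD SPACE collations ignore trailing spaces in comparisons, while NO PAD collations such as utf8mb4_0900_ai_ci do not.)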
mysql> create table employee1(id varchar(10) not null primary key,name varchar(10));
Query OK, 0 rows affected (0.01 sec)
mysql> insert into employee1 values('liys ','llljjj ');
Query OK, 1 row affected (0.01 sec)
mysql> select concat(id,'AAA'),concat(name,'AAA') from employee1;
+------------------+--------------------+
| concat(id,'AAA') | concat(name,'AAA') |
+------------------+--------------------+
| liys AAA | llljjj AAA |
+------------------+--------------------+
1 row in set (0.00 sec)
mysql> insert into employee1 values('liys','llljjj ');
ERROR 1062 (23000): Duplicate entry 'liys' for key 'PRIMARY'
mysql> insert into employee1 values('liys1','llljjj ');
Query OK, 1 row affected (0.01 sec)
mysql> select concat(id,'AAA'),concat(name,'AAA') from employee1;
+------------------+--------------------+
| concat(id,'AAA') | concat(name,'AAA') |
+------------------+--------------------+
| liys AAA | llljjj AAA |
| liys1AAA | llljjj AAA |
+------------------+--------------------+
2 rows in set (0.00 sec)
mysql> insert into employee1 values(' liys1','llljjj ');
Query OK, 1 row affected (0.00 sec)
mysql> select concat(id,'AAA'),concat(name,'AAA') from employee1;
+------------------+--------------------+
| concat(id,'AAA') | concat(name,'AAA') |
+------------------+--------------------+
| liys1AAA | llljjj AAA |
| liys AAA | llljjj AAA |
| liys1AAA | llljjj AAA |
+------------------+--------------------+
3 rows in set (0.00 sec)
mysql> insert into employee1 values(' liys','llljjj ');
Query OK, 1 row affected (0.01 sec)
mysql> select concat(id,'AAA'),concat(name,'AAA') from employee1;
+------------------+--------------------+
| concat(id,'AAA') | concat(name,'AAA') |
+------------------+--------------------+
| liysAAA | llljjj AAA |
| liys1AAA | llljjj AAA |
| liys AAA | llljjj AAA |
| liys1AAA | llljjj AAA |
+------------------+--------------------+
4 rows in set (0.00 sec)
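What if the input string is longer than varchar(10)? Is it truncated automatically, or does the insert fail? The result is the same as in DB2: an INSERT fails with an error, but LOAD DATA truncates the value and loads it.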
vi employee1.del
3,fhjl
6,woaizhogngguoodddafafsaa
mysql> load data local infile "/home/mysql/liys/employee1.del" into table employee FIELDS TERMINATED BY ',' lines terminated by '\n';
Query OK, 2 rows affected, 1 warning (0.00 sec)
Records: 2 Deleted: 0 Skipped: 0 Warnings: 1
mysql> select * from employee;
+----+------------+
| id | name |
+----+------------+
| 3 | fhjl |
| 6 | woaizhogng |
+----+------------+
2 rows in set (0.00 sec)
mysql> insert into employee values (234,'afasfhflashflashflaisfasfsafassaf');
ERROR 1406 (22001): Data too long for column 'name' at row 1
mysql>
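Because a load truncates silently (only a warning is raised), it may be worth scanning the input file for over-long values before loading. The sketch below assumes a comma-delimited file and counts length in characters; the function name `overlong_fields` and the limit of 11 for the id column are hypothetical choices for illustration.

```python
import csv
import io
from typing import List, Tuple


def overlong_fields(del_text: str, max_lengths: List[int]) -> List[Tuple[int, int, str]]:
    """Return (line_number, column_index, value) for each field over its limit."""
    problems = []
    reader = csv.reader(io.StringIO(del_text))
    for line_no, record in enumerate(reader, start=1):
        # Compare each field with the declared length of its column.
        for col, (value, limit) in enumerate(zip(record, max_lengths)):
            if len(value) > limit:
                problems.append((line_no, col, value))
    return problems


# Contents of employee1.del from the test above.
sample = "3,fhjl\n6,woaizhogngguoodddafafsaa\n"
print(overlong_fields(sample, [11, 10]))
# [(2, 1, 'woaizhogngguoodddafafsaa')]
```

For multi-byte data the character count and the byte count differ, so the limit would need to be checked in whichever unit the target column uses.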