MERGE
Merge操作要解决的问题
在我们进行数据处理过程中,常常遇到如下操作过程:
用源表数据更新目标表过程,其操作的方式为:
1、 如果满足一定的关联条件,就用源表数据更新目标表;
2、 如果不满足一定的关联条件,就认为是新的数据,将这些数据插入到目标表。
先假定有以下的表数据
EMP_SOURCE |
EMP_TARGET |
|||||||
EMPNO |
ENAME |
SAL |
|
EMPNO |
ENAME |
SAL |
||
7566 |
梅兰芳 |
1100.00 |
7369 |
SMITH |
800.00 |
|||
7654 |
孔老二 |
1250.00 |
7499 |
ALLEN |
1600.00 |
|||
7698 |
李元霸 |
2850.00 |
7521 |
WARD |
1.00 |
|||
7782 |
关羽 |
2450.00 |
7566 |
JONES |
2.00 |
|||
7788 |
赵云 |
3000.00 |
7654 |
MARTIN |
3.00 |
|||
7839 |
雷锋 |
5000.00 |
7698 |
BLAKE |
4.00 |
|||
7844 |
忍着笙贵 |
1500.00 |
7782 |
CLARK |
2450.00 |
|||
7876 |
黄石人 |
1100.00 |
7788 |
SCOTT |
3000.00 |
|||
7900 |
八大山人 |
950.00 |
7839 |
KING |
5000.00 |
|||
7902 |
郭景 |
3000.00 |
|
|
|
|||
7934 |
忍者神龟 |
1300.00 |
|
EMP_SOURCE的表中彩色部分的EMPNO在EMP_TARGET中存在,而白色部分是新的数据。现在我们用传统方法实现merge操作。
insert into emp_target
select * from emp_source
where enpno not in (select enpnofrom emp_target);
update emp_target a
set a.ename=(select ename fromemp_source where enpno =a.enpno)
where enpno in (select enpno fromemp_source);
离线操作
select nvl(b.enpno,a.enpno) empno,
nvl(b.ename,a.ename) ename,
nvl(b.sal,a.sal) sal
from emp_target a full outer joinemp_source b
on (a.enpno=b.enpno)
传统的做法也能实现,但是需要多条DML语句,且其中用到反关联,在某一些数据库中其效率很低。
在ORACLE9I版本开始,提供了一种新的操作方式MERGE,与DELETE INSERT UPDATE一道提供数据的DML操作,在10G中丰富了MERGE的操作方法,下面给出事例。
merge into emp_target t1
using emp_source t2
on (t1.empno= t2.empno)
when matched then
update
setename = t2.ename
when not matched then
insert (empno,ename,sal)
values (t2.empno,t2.ename,t2.sal)
MERGE后的结果如下:
EMPNO |
ENAME |
SAL |
7369 |
SMITH |
800.00 |
7499 |
ALLEN |
1600.00 |
7521 |
WARD |
1.00 |
7566 |
梅兰芳 |
2.00 |
7654 |
孔老二 |
3.00 |
7698 |
李元霸 |
4.00 |
7782 |
关羽 |
2450.00 |
7788 |
赵云 |
3000.00 |
7839 |
雷锋 |
5000.00 |
7844 |
忍着笙贵 |
1500.00 |
7934 |
忍着笙贵 |
1300.00 |
7902 |
郭景 |
3000.00 |
7876 |
黄石人 |
1100.00 |
7900 |
八大山人 |
950.00 |
当然MERGE支持例外数据处理
在目标表上建立2个约束
EMPNO是主键
ALTER TABLE EMP_TARGET ADD CONSTRAINT EMP_TARGET_PKPRIMARY KEY (EMPNO) ENABLE VALIDATE
员工名称唯一性
ALTER TABLE EMP_TARGET ADD CONSTRAINT NAME_UNIQUE UNIQUE (ENAME)
ENABLE VALIDATE
建立例外表
begin
DBMS_ERRLOG.create_error_log('EMP_TARGET','EXCEP_FEMP_TARGET','SCOTT');
end;
带有例外处理的MERGE
merge intoemp_target t1
using emp_source t2
on (t1.empno= t2.empno)
when matched then
update set ename= t2.ename
when not matched then
insert(empno,ename,sal) values (t2.empno,t2.ename,t2.sal)
LOG ERRORS INTOEXCEP_FEMP_TARGET('load_act1')
REJECT LIMIT UNLIMITED;
运行后会在例外表中出现“忍着笙贵”的记录。
ALTERTABLEEMP_SOURCERENAMECOLUMNENPNO TO EMPNO;
ALTERTABLEEMP_TARGETRENAMECOLUMNENPNO TO EMPNO;
在10G中增加的功能
1、 增加DELETE
merge into emp_target t1
using emp_source t2
on (t1.empno= t2.empno)
when matched then
update set t1.ename=t2.ename,
t1.sal=t1.sal+t2.sal
WHERE T2.SAL <4000
deletewhere t1.sal> 4000
when not matched then
insert (T1.EMPNO,T1.ENAME,T1.SAL)
values(T2.EMPNO,T2.ENAME,T2.SAL) ;
2、 可以只考虑匹配的一种情况
只考虑不匹配
merge intoemp_target t1
using emp_source t2
on (t1.empno= t2.empno)
--when matchedthen
--update set t1.ename=t2.ename,
-- t1.sal=t1.sal+t2.sal
-- WHERE T2.SAL <4000
-- delete where t1.sal> 4000
when not matched then
insert (T1.EMPNO,T1.ENAME,T1.SAL)
values (T2.EMPNO,T2.ENAME,T2.SAL)
也可以只考虑不匹配
merge intoemp_target t1
using emp_source t2
on (t1.empno= t2.empno)
when matched then
update set t1.ename=t2.ename,
t1.sal=t1.sal+t2.sal
WHERE T2.SAL <4000
delete where t1.sal> 4000
--when notmatched then
--insert (T1.EMPNO,T1.ENAME,T1.SAL)
-- values (T2.EMPNO,T2.ENAME,T2.SAL)
这提供了用MERGE代替INSERT与UPDATE的操作方法。
一个BUG现象,由于增加了DELETE,例外处理不起作用
merge into emp_target t1
using emp_source t2
on (t1.empno= t2.empno)
when matched then
update set t1.ename=t2.ename,
t1.sal=t1.sal+t2.sal
WHERE T2.SAL <4000
delete where t1.sal> 4000
when not matched then
insert (T1.EMPNO,T1.ENAME,T1.SAL)
values (T2.EMPNO,T2.ENAME,T2.SAL)
LOG ERRORS INTO
EXCEP_FEMP_TARGET('load_act2')
REJECT LIMITUNLIMITED;
这个bug在11g中解决。
MERGE的工作机制
for x in ( select * from source )
loop
if record exists --读到SCOURCE表结尾,就退出
then
if record hasn't been modified by us yet
then
update it
else
fail --如果源表的关联条件不能满足主键约束,则失败
--不知道用哪一条数据更新目标表
--发生ORA-30926错误,无法从源表中获得一组稳定的行
end if
else
insert it
end if
end loop
这只是伪代码,在实际执行过程中,MERGE并不是一行一行地来执行,而是批操作。
说明:
create table t1 ( x int, y int );
create table t2 ( x int, y int );
insert into t1 values ( 1, 3 );
insert into t1 values ( 2, 1 );
merge into t2 using t1
on (t1.x = t2.x)
when matched then update set y = t1.y
when not matched then insert (x,y) values (t1.x,t1.y);
输出:2 rowsmerged.
insert into t1 values ( 1, 1 );
merge into t2 usingt1
on (t1.x = t2.x)
when matched then update set y = t1.y
when not matched then insert (x,y) values (t1.x,t1.y);
merge into t2 using t1
发生ORA-30926错误,无法从源表中获得一组稳定的行
因此,如同UPDATE操作一样,MERGE隐含一个先决条件,不能一次操作过程中用2条数据去UPDATE目标中的一条记录,因此源表一般隐含着在关联字段上存在不重复的约束,然而,即使在源表中同一个关联字段下有重复记录,如果目标表不存在这个关联字段的记录,那么不会发生UPDATE操作,而是将这2条记录全部插入目标表,不会发生ORA-30926错误。
特别是目标表为空,实际上,这个时候的MERGE操作,就是INSERT操作,如
10g>select * from data;
C1 C2
---------- ----------
row 1 column 1
row 2 column 0
row 0 column 1
row 1 column 0
row 2 column 1
row 0 column 0
row 1 column 1
row 2 column 0
row 0 column 1
select distinct c1,c2 from data;
C1 C2
---------- ----------
row 1 column 1
row 2 column 0
row 0 column 0
row 0 column 1
row 1 column 0
row 2 column 1
有3条记录在源表中重复
select * from target;
no rows selected
merge into target t using data d
on (t.c1=d.c1 and t.c2=d.c2)
when not matched then
insert values(d.c1,d.c2)
9 rows merged. 实际上插入了9条记录
select * fromtarget;
C1 C2
---------- ----------
row 0 column 1
row 0 column 1
row 2 column 0
row 2 column 0
row 1 column 0
row 1 column 1
row 1 column 1
row 2 column 1
row 0 column 0
9 rows selected.
merge into target t using data d
on (t.c1=d.c1 and t.c2=d.c2)
when not matched then
insert values(d.c1,d.c2)
再次运行,怎么会有以下情况
0 rows merged.
在merge语句有多个地方可以增加filter条件
merge intoemp_target t1
using emp_source t2
on (t1.empno= t2.empno)
when matched then
update sett1.ename='U_'||t2.ename,
t1.sal=t1.sal+t2.sal
WHERE T2.SAL<2000
when not matched then
insert(T1.EMPNO,T1.ENAME,T1.SAL)
values(T2.EMPNO,'I_'||T2.ENAME,T2.SAL);
EMPNO |
ENAME |
SAL |
7369 |
SMITH |
800.00 |
7499 |
ALLEN |
1600.00 |
7521 |
WARD |
1.00 |
7566 |
U_梅兰芳 |
1102.00 |
7654 |
U_孔老二 |
1253.00 |
7698 |
BLAKE |
4.00 |
7782 |
CLARK |
2450.00 |
7788 |
SCOTT |
3000.00 |
7839 |
KING |
5000.00 |
7844 |
I_忍着笙贵 |
1500.00 |
7934 |
I_忍者神龟 |
1300.00 |
7902 |
I_郭景 |
3000.00 |
7876 |
I_黄石人 |
1100.00 |
7900 |
I_八大山人 |
950.00 |
merge intoemp_target t1
using emp_source t2
on (t1.empno= t2.empno and T2.SAL<2000)
when matched then
update sett1.ename='U_'||t2.ename,
t1.sal=t1.sal+t2.sal
--WHERE T2.SAL <2000
when not matched then
insert(T1.EMPNO,T1.ENAME,T1.SAL)
values(T2.EMPNO,'I_'||T2.ENAME,T2.SAL)
EMPNO |
ENAME |
SAL |
7369 |
SMITH |
800.00 |
7499 |
ALLEN |
1600.00 |
7521 |
WARD |
1.00 |
7566 |
U_梅兰芳 |
1102.00 |
7654 |
U_孔老二 |
1253.00 |
7698 |
BLAKE |
4.00 |
7782 |
CLARK |
2450.00 |
7788 |
SCOTT |
3000.00 |
7839 |
KING |
5000.00 |
7698 |
I_李元霸 |
2850.00 |
7782 |
I_关羽 |
2450.00 |
7788 |
I_赵云 |
3000.00 |
7839 |
I_雷锋 |
5000.00 |
7844 |
I_忍着笙贵 |
1500.00 |
7876 |
I_黄石人 |
1100.00 |
7900 |
I_八大山人 |
950.00 |
7902 |
I_郭景 |
3000.00 |
7934 |
I_忍者神龟 |
1300.00 |
merge intoemp_target t1
using emp_source t2
on (t1.empno= t2.empno)-- andT2.SAL <2000)
when matched then
update sett1.ename='U_'||t2.ename,
t1.sal=t1.sal+t2.sal
--WHERET2.SAL <2000
when not matched then
insert(T1.EMPNO,T1.ENAME,T1.SAL)
values(T2.EMPNO,'I_'||T2.ENAME,T2.SAL)
EMPNO |
ENAME |
SAL |
7369 |
SMITH |
800.00 |
7499 |
ALLEN |
1600.00 |
7521 |
WARD |
1.00 |
7566 |
U_梅兰芳 |
1102.00 |
7654 |
U_孔老二 |
1253.00 |
7698 |
U_李元霸 |
2854.00 |
7782 |
U_关羽 |
4900.00 |
7788 |
U_赵云 |
6000.00 |
7839 |
U_雷锋 |
10000.00 |
7844 |
I_忍着笙贵 |
1500.00 |
7934 |
I_忍者神龟 |
1300.00 |
7902 |
I_郭景 |
3000.00 |
7876 |
I_黄石人 |
1100.00 |
7900 |
I_八大山人 |
950.00 |
MERGE的效率问题
begin
for i in 1..900009 loop
insert into EMP_target
values(i,'S_'||dbms_random.string('i',8),dbms_random.value(1000,2000));
end loop;
end;
begin
for i in 1..9000 loop
insert into EMP_source
values(i,'S_'||dbms_random.string('i',8),dbms_random.value(1000,2000));
end loop;
end;
analyze tableemp_target compute statistics
for table
for all indexed columns
for all indexes
analyze tableemp_source compute statistics
for table
for all indexed columns
for all indexes
explain planfor
merge intoemp_target t1
using emp_source t2
on (t1.empno= t2.empno)
when matched then
update set t1.ename='U_'||t2.ename,
t1.sal=t1.sal+t2.sal
when not matched then
insert (T1.EMPNO,T1.ENAME,T1.SAL)
values (T2.EMPNO,'I_'||T2.ENAME,T2.SAL)
select * fromtable(dbms_xplan.display());
-----------------------------------------------------------------------------------
| Id |Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | MERGESTATEMENT | | 9000 | 966K| 911 (2)| 00:00:11 |
| 1 | MERGE | EMP_TARGET | | | | |
| 2 | VIEW | | | | | |
|* 3 | HASH JOIN OUTER | | 9000 | 413K| 911 (2)| 00:00:11 |
| 4 | TABLE ACCESS FULL| EMP_SOURCE | 9000 | 202K| 11 (0)| 00:00:01|
| 5 | TABLE ACCESS FULL| EMP_TARGET | 900K| 20M| 896 (2)| 00:00:11|
-----------------------------------------------------------------------------------
我们看到索引没有被用上
explain plan for
merge into emp_target t1
using (select * from emp_source where empno>0 andempno<10000) t2
on (t1.empno= t2.empno)
when matched then
update set t1.ename='U_'||t2.ename,
t1.sal=t1.sal+t2.sal
when not matched then
insert (T1.EMPNO,T1.ENAME,T1.SAL)
values (T2.EMPNO,'I_'||T2.ENAME,T2.SAL)
select * from table(dbms_xplan.display());
Plan hash value: 4090952343
---------------------------------------------------------------------------------------------
| Id | Operation |Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 9000 | 966K| 73 (2)| 00:00:01 |
| 1 | MERGE | EMP_TARGET | | | | |
| 2 | VIEW | | | | | |
|* 3 | HASH JOIN OUTER | | 9000 | 413K| 73 (2)| 00:00:01 |
|* 4 | TABLE ACCESS FULL | EMP_SOURCE | 9000 | 202K| 11 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID|EMP_TARGET | 9999 | 234K| 61 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | EMP_PK_T | 9999 | | 25 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
explain plan for
merge /*+index(t1 emp_pk_t)*/ intoemp_target t1
using (select /*+index(emp_sourceemp_pk)*/ * from emp_source) t2
on (t1.empno= t2.empno)
when matched then
update set t1.ename='U_'||t2.ename,
t1.sal=t1.sal+t2.sal
when not matched then
insert (T1.EMPNO,T1.ENAME,T1.SAL)
values(T2.EMPNO,'I_'||T2.ENAME,T2.SAL)
select * from table(dbms_xplan.display());
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 9000 | 966K| 5264 (1)| 00:01:04 |
| 1 | MERGE | EMP_TARGET | | | | |
| 2 | VIEW | | | | | |
|* 3 | HASH JOIN OUTER | | 9000 | 413K| 5264 (1)| 00:01:04 |
| 4 | TABLE ACCESS FULL | EMP_SOURCE | 9000 | 202K| 11 (0)| 00:00:01|
| 5 | TABLE ACCESS BY INDEX ROWID| EMP_TARGET| 900K| 20M| 5249 (1)| 00:01:03 |
| 6 | INDEXFULL SCAN | EMP_PK_T | 900K| | 2015 (1)| 00:00:25 |
---------------------------------------------------------------------------------------------
explain planfor
merge into emp_target t1
using (select /*+cardinality(emp_source 10)*/* from emp_source ) t2
on (t1.empno= t2.empno)
when matched then
update set t1.ename='U_'||t2.ename,
t1.sal=t1.sal+t2.sal
when not matched then
insert (T1.EMPNO,T1.ENAME,T1.SAL)
values (T2.EMPNO,'I_'||T2.ENAME,T2.SAL)
select * fromtable(dbms_xplan.display());
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 10 | 1100 | 41 (0)| 00:00:01 |
| 1 | MERGE | EMP_TARGET | | | | |
| 2 | VIEW | | | | | |
| 3 | NESTED LOOPS OUTER | | 10 | 470 | 41 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | EMP_SOURCE | 10 | 230 | 11 (0)| 00:00:01|
| 5 | TABLE ACCESS BY INDEX ROWID| EMP_TARGET| 1 | 24 | 3 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | EMP_PK_T | 1 | | 2 (0)| 00:00:01|
---------------------------------------------------------------------------------------------
好像这是一个oracle数据库的bug
用跟踪文件,来进行比较
merge intoemp_target t1
using(select * from emp_source ) t2
on (t1.empno= t2.empno)
when matchedthen
update set t1.ename='U_'||t2.ename,
t1.sal=t1.sal+t2.sal
when not matchedthen
insert(T1.EMPNO,T1.ENAME,T1.SAL)
values(T2.EMPNO,'I_'||T2.ENAME,T2.SAL)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.03 0.02 0 0 0 0
Execute 1 0.65 0.94 3233 3313 9273 9000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.68 0.97 3233 3313 9273 9000
Misses inlibrary cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 114
Rows RowSource Operation
------- ---------------------------------------------------
2 MERGE EMP_TARGET (cr=3313 pr=3233 pw=3233 time=2 us)
9000 VIEW (cr=3313 pr=3233 pw=3233 time=939 us)
9000 HASH JOIN OUTER (cr=3313 pr=3233 pw=3233time=678 us cost=911 size=423000 card=9000)
9000 TABLE ACCESS FULL EMP_SOURCE (cr=38 pr=0pw=0 time=138 us cost=11 size=207000 card=9000)
900009 TABLE ACCESS FULL EMP_TARGET (cr=3275pr=3233 pw=3233 time=18192 us cost=896 size=21600216 card=900009)
********************************************************************************
merge intoemp_target t1
using (select/*+cardinality(emp_source 10)*/ * from emp_source ) t2
on (t1.empno= t2.empno)
when matchedthen
update set t1.ename='U_'||t2.ename,
t1.sal=t1.sal+t2.sal
when not matchedthen
insert(T1.EMPNO,T1.ENAME,T1.SAL)
values(T2.EMPNO,'I_'||T2.ENAME,T2.SAL)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.03 0.02 0 0 0 0
Execute 1 0.48 0.37 0 431 9273 9000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.51 0.39 0 431 9273 9000
Misses inlibrary cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 114
Rows RowSource Operation
------- ---------------------------------------------------
2 MERGE EMP_TARGET (cr=431 pr=0 pw=0 time=2 us)
9000 VIEW (cr=431 pr=0 pw=0 time=2613 us)
9000 NESTED LOOPS OUTER (cr=431 pr=0 pw=0time=2335 us cost=41 size=470 card=10)
9000 TABLE ACCESS FULL EMP_SOURCE (cr=38 pr=0pw=0 time=204 us cost=11 size=230 card=10)
9000 TABLE ACCESS BY INDEX ROWID EMP_TARGET(cr=393 pr=0 pw=0 time=0 us cost=3 size=24 card=1)
9000 INDEX RANGE SCAN EMP_PK_T (cr=361 pr=0pw=0 time=0 us cost=2 size=0 card=1)(object id 82893)
********************************************************************************
实际上刚才那个操作相当于
update /*111*/ emp_target a
set a.ename=(select ename from emp_source where empno=a.empno)
where empno in (select empno from emp_source)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.82 0.98 1669 21467 9222 9000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.82 0.98 1669 21467 9222 9000
Misses inlibrary cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 114
Rows RowSource Operation
------- ---------------------------------------------------
0 UPDATE EMP_TARGET (cr=21471 pr=1669 pw=1669 time=0 us)
9000 HASH JOIN (cr=3299 pr=1669pw=1669 time=751 us cost=906 size=216000 card=9000)
9000 INDEX FAST FULL SCAN EMP_PK (cr=24 pr=0pw=0 time=136 us cost=6 size=27000 card=9000)(object id 82884)
900009 TABLE ACCESS FULL EMP_TARGET (cr=3275 pr=1669pw=1669 time=18127 us cost=895 size=18900189 card=900009)
9000 TABLE ACCESS BY INDEX ROWID EMP_SOURCE(cr=18167 pr=0 pw=0 time=0 us cost=2 size=20 card=1)
9000 INDEX UNIQUE SCAN EMP_PK (cr=9167 pr=0 pw=0time=0 us cost=1 size=0 card=1)(object id 82884)
selectempno,ename from emp_target where empno<9001
minus
selectempno,ename from emp_source
-----
null
selectempno,ename from emp_source
minus
select empno,ename from emp_target where empno<9001
-----
null
update
(
select a.ename s1, b.ename s2
from emp_source a, emp_target b
where a.empno=b.empno and
b.empno<9001
) r
set r.s2=r.s1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.10 0.11 0 92 9222 9000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.10 0.11 0 92 9222 9000
Misses in library cacheduring parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 114
Rows Row Source Operation
------- ---------------------------------------------------
0 UPDATE EMP_TARGET (cr=96 pr=0 pw=0 time=0 us)
9000 HASH JOIN (cr=91 pr=0 pw=0 time=1105 us cost=68 size=368959 card=8999)
9000 TABLE ACCESS FULL EMP_SOURCE (cr=38 pr=0pw=0 time=144 us cost=11 size=180000 card=9000)
9000 TABLE ACCESS BY INDEX ROWID EMP_TARGET(cr=53 pr=0 pw=0 time=485 us cost=56 size=189000 card=9000)
9000 INDEX RANGE SCAN EMP_PK_T (cr=21 pr=0 pw=0time=172 us cost=23 size=0 card=9000)(object id 82893)
update
(
select /*+cardinality(a 10)*/ a.ename s1, b.ename s2
from emp_source a, emp_target b
where a.empno=b.empno
--and
--b.empno<9001
) r
set r.s2=r.s1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.26 0.26 0 431 150 9000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 0.26 0.26 0 431 150 9000
Misses in library cacheduring parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 114
Rows Row Source Operation
------- ---------------------------------------------------
0 UPDATE EMP_TARGET (cr=431 pr=0 pw=0 time=0 us)
9000 NESTED LOOPS (cr=431 pr=0 pw=0 time=2542 us)
9000 NESTED LOOPS (cr=399 pr=0 pw=0 time=1402 us cost=41size=410 card=10)
9000 TABLE ACCESS FULL EMP_SOURCE (cr=38 pr=0pw=0 time=178 us cost=11 size=200 card=10)
9000 INDEX RANGE SCAN EMP_PK_T (cr=361 pr=0pw=0 time=0 us cost=2 size=0 card=1)(object id 82893)
9000 TABLE ACCESS BY INDEX ROWID EMP_TARGET(cr=32 pr=0 pw=0 time=0 us cost=3 size=21 card=1)
集合的merge操作
我们已经知道merge操作的机制,即如果关联不上,说明源表的那条数据,在目标表中不存在,我们将数据从源表提出“直接”插入到目标表,如果能够关联上,说明关联上的记录已经在目标表中存在,只是部分信息(关联字段以外的其它字段中的部分信息已经改变)需要用源表的最新信息来更新。这里“直接”的意思是“不需要在引擎与pl/sql引擎之间转化,因为那样需要更多的成本”。
由于存在“直接”的机制,所以oracle建议,“有则更新,没有就插入”这样的操作尽量采用merge操作。
但是问题随之而来,我们知道merge操作是由insert与update2种方式组成的,这2种方式的操作,对索引的要求不一样,考虑到以下情况:
1、 如果目标表有1亿条记录,源表有10万记录;其中99999条记录要插入,只有一条数据需要update,那目标表上的索引不是成了负担;
2、 如果我们更需要个性化的规则,如f1(关联字段),与f2一个非关联字段,如果这两个字段不变,则认为这条记录没有变化,即无需插入,也无需更新;
3、 如果这个merge产生太大的undo,我们希望将真个工作划分多个小的工作,当小的工作完成的时候,立即提交;
4、 自定义多进程;
这时候集合操作方法也许更有利:
操作方法:
create table emp_target
as
select *
from scott.emp
where mod(empno,2) =0;
update emp_targetset sal = 0;
commit;
create type emp_recas object (
EMPNO NUMBER(4),
ENAME VARCHAR2(10),
JOB VARCHAR2(9),
MGR NUMBER(4),
HIREDATE DATE,
SAL NUMBER(7,2),
COMM NUMBER(7,2),
DEPTNO NUMBER(2) )
/
建立pl/sqltable;
create typeemp_array as table of emp_rec;
declare
type array is table of emp%rowtype index bybinary_integer;
l_rec array;
l_data emp_array;
begin
selectempno,ename,job,mgr,hiredate,sal,comm,deptno bulk collect
into l_rec
from emp;
l_data := emp_array();
l_data.extend( l_rec.count );
for i in 1 .. l_rec.count
loop
-- process record
l_data(i) := emp_rec(l_rec(i).empno, l_rec(i).ename, l_rec(i).job,
l_rec(i).mgr, l_rec(i).hiredate, l_rec(i).sal,
l_rec(i).comm, l_rec(i).deptno );
end loop;
merge into EMP e1
using (select * fromTABLE(cast(l_data as emp_array)) ) e2
on ( e2.empno = e1.empno )
when matched then
update set e1.sal = e2.sal
when not matched then
insert ( empno, ename, job, mgr,hiredate, sal, comm, deptno )
values ( e2.empno, e2.ename, e2.job,e2.mgr, e2.hiredate, e2.sal, e2.comm, e2.deptno );
end;
select ename, salfrom emp;
ENAME SAL
--------------------
JONES 2975
MARTIN 1250
BLAKE 2850
CLARK 2450
SCOTT 3000
TURNER 1500
ADAMS 1100
JAMES 950
FORD 3000
MILLER 1300
KING 5000
WARD 1250
ALLEN 1600
SMITH 800
如何观察一个merge有多少数据插入,有多少数据更新
在merge操作中,
create table t as select * fromall_users where 1=0;
begin
merge into t usingall_users on (t.user_id = all_users.user_id)
when matched
then update setusername = all_users.username
when not matchedthen
insert(user_id,username,created)
values (all_users.user_id,all_users.username,all_users.created);
dbms_output.put_line( sql%rowcount|| ' rows mergeds' );
end;
多次运行过程,观看output,发现sql%rowcount总为36。
select count(*) from all_users
以上查询结果也是36,所以说merge操作,sql%rowcount为数据源表的条数。
有什么方法,能够看出一个merge操作,插入多少条,又更新了多少条记录那?
编写一个包含全局变量的包
create orreplace package showMergeExample AS
tableCountNUMBER :=0; --全局变量
FUNCTIONget_row_inserted return number;
proceduresettableCount;
END ;
CREATE OR REPLACE package BODY showMergeExample IS
FUNCTIONget_row_inserted --当过程被调用一次 tableCount就增加1
RETURNNUMBER
AS
BEGIN
tableCount:=nvl(tableCount,0)+1;
RETURN 0;
EXCEPTIONWHEN OTHERS THEN
RETURN 0;
dbms_output.put_line(tableCount);
END;
procedure settableCount as -- tableCount被请0
begin
tableCount:=0;
end;
end;
下面一个过程
begin
showmergeexample.settablecount;--计数器被清0
dbms_output.put_line(showmergeexample.tablecount );
merge into t
using all_users
on (t.user_id = all_users.user_id)
when matched then
update
set username = all_users.username
when not matched then
insert (user_id, username, created)
values (all_users.user_id +showmergeexample .get_row_inserted,
all_users.username,all_users.created);
--当执行插入的时候,showmergeexample.get_row_inserted被执行一次 计数器就增加1
dbms_output.put_line(showmergeexample.tablecount );
dbms_output.put_line(showmergeexample.tablecount || ' rows inserted ');
dbms_output.put_line ((sql%rowcount-showmergeexample.tablecount) || ' rows updated ');
end;
输出结果:
0
24
24 rows Inserted
12 rows Updated
原创文章,如果转载,请标注作者:田文 CSDN地址:http://blog.csdn.net/tiwen818