You Asked
Good Morning Tom.
I need your expertise in this regard. I have a table which contains millions of
records. I want to update and commit every time for so many records (say
10,000 records). I don't want to do it in one stroke as I may end up in rollback
segment issue(s). Any suggestions please!
Murali
and we said...
If I had to update millions of records I would probably opt to NOT update.
I would more likely do:
CREATE TABLE new_table as select
index new_table
grant on new table
add constraints on new_table
etc on new_table
drop table old_table
rename new_table to old_table;
you can do that using parallel query, with nologging on most operations
generating very little redo and no undo at all -- in a fraction of the time it
would take to update the data.
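For illustration, a minimal sketch of that approach (names and the parallel degree are only examples; adjust to your own schema and system):
create table new_emp
  nologging
  parallel 8
as
select empno, lower(ename) ename, job, mgr, hiredate, sal, comm, deptno
  from emp;
-- recreate indexes, grants and constraints on new_emp (nologging/parallel where possible), then:
drop table emp;
rename new_emp to emp;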
Reviews
updating millions of records November 11, 2002
Reviewer: Om from India
Hi Tom,
Could you please elaborate
CREATE TABLE new_table as select
the above statement with a suitable example please
Followup:
ok, say you wanted to update emp to set ename = lower(ename). Instead, you
could do this:
[email protected]> create table new_emp as
2 select empno, LOWER(ename) ename, JOB,
3 MGR, HIREDATE, SAL, COMM, DEPTNO
4 from emp;
Table created.
[email protected]>
[email protected]> drop table emp;
Table dropped.
[email protected]> rename new_emp to emp;
Table renamed.
Million records update November 11, 2002
Reviewer: Ramesh G from U.K
What if a table has over 100 million records and if i only want to update 1
million? If your method is still applicable could you elaborate it.
Many thanks in advance.
Followup:
most likely -- yes. I don't have a 100million row table to test with for you
but -- the amount of work required to update 1,000,000 indexed rows is pretty
large. Fortunately, you are probably using partitioning so you can do this
easily in parallel -- bit by bit.
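A hedged sketch of the "bit by bit" idea, assuming a range-partitioned table (the partition name, column and degree are made up) -- one partition at a time, optionally each submitted as its own job:
alter session enable parallel dml;
update /*+ parallel(t, 4) */ big_table partition (p_2002_01) t
   set some_col = upper(some_col)
 where some_criteria = 'Y';
commit;
-- repeat for each partition that contains rows to be changed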
"How to Update millions or records in a table", version 8.1.7 November 11, 2002
Reviewer: John Bittner from Hunt Valley, MD USA
This is absolutely a viable approach, and one we have used repeatedly. One of
our apps updates a table of several hundred million records.
The cursor..For loop approach for the update was calculated to take 53.7 years
to complete!
We instituted the insert into a dummy table append with nologging, and were able
to complete the "update" in under 30 minutes.
With nologging, if the system aborts, you simply re-run the 'update' again, as
you have the original data in the main table. When done, we swap the partition
of original data with the 'dummy' table (the one containing new values),
rebuild indexes in parallel, and voila! Our update is complete.
i.e, to update field2 in a table:
1) First create your dummy hold table: create table xyz_HOLD as select * from
xyz where rownum<1. Alter table xyz_hold nologging.
2) insert /*+ append parallel (xyzhold,12) */ into xyz_hold xyzhold (field1,
field2, field3) select /*+ parallel (x,12) */ xyz.field1,
my_new_value_for_field2, xyz.field3 from xyz x where blah blah blah.
3) when done, either rename the table, or swap the partition if your original
table is partitioned, and you only updated one partition as we do. Obviously
you need to rebuild indexes, etc. as required.
Hope this helps!
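A minimal sketch of the final swap John describes, assuming xyz is partitioned and xyz_HOLD now holds the rebuilt data for one partition (the partition and index names are invented):
alter table xyz
  exchange partition p_2002_01 with table xyz_hold
  without validation;
-- then rebuild the affected indexes, e.g.
alter index xyz_idx1 rebuild parallel 8 nologging;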
updating millions of records November 11, 2002
Reviewer: A.Ashiq from Trichy,Tamil Nadu ,India
Hi Tom,
As you suggested, we create a new table, then drop the original table and rename
the new table to the original table, instead of updating a table with millions of
records. But what happens to dependent objects -- everything will get
invalidated. Yes, of course they'll recompile themselves when called the next time. But
again the dependent objects have to do parsing. Is that OK?
Followup:
It is OK.
in case of delete November 12, 2002
Reviewer: A reader
We have a similar situation. We delete around 3 million records from a 30 million
row table every day.
There is no logical column to partition on.
I guess the insert into a new table will take considerable time with 27 million
records. Please let me know what is the best approach.
Followup:
wait 10 days so that you are deleting 30 million records from a 60 million
record table and then this will be much more efficient.
Time it some day though. 3 million records on an indexed table will take
considerable time. There is a chance that an INSERT /*+ append */ select of the
rows you want to keep could beat the delete.
November 12, 2002
Reviewer: A reader
Tom,
Recently I conducted an interview in which one of the DBAs mentioned that they
had a table that might contain 10 million records, or might be 1 million. He
meant to say they delete the records and some time later the table will be
populated again, and vice versa.
Tom, according to you, would you consider partitions for such tables, and if yes, which
type of partition?
Thanks.
Followup:
hard to tell -- is the data deleted by something that is relatively constant
(eg: the value in that column doesn't change - so the row doesn't need to move
from partition to partition). If so -- sure, cause we could just drop
partitions (fast) instead of deleting the data.
Agree with John - insert append / CTAS / partition swap is the only way to fly November 12, 2002
Reviewer: Jack Silvey from Richardson, TX
I work with John Bittner, one of the previous reviewers. I second what he said
absolutely. It is the only way to fly. We also have an absolutely incredible
stored procedure that rebuilds all of our indexes concurrently after the load,
using the Oracle job scheduler as the mechanism of allowing separate threads in
pl/sql.
addendum November 13, 2002
Reviewer: A reader
This process was introduced to our environment by a master tuner and personal
friend, Larry Elkins. This was a totally new paradigm for the application, and
one that saved the entire mission-critical application. The in-place updates
would not have worked with the terabytes of data that we have in our database.
How to Update millions of records in a table November 19, 2002
Reviewer: Boris Milrud from San Jose, CA USA
In response to the Jack Silvey (from Richardson, TX ) review, where he wrote "It
is the only way to fly. We also have an absolutely incredible stored procedure
that rebuilds all of our indexes concurrently after the load, using the Oracle
job scheduler as the mechanism of allowing separate threads in pl/sql":
Could you provide more information about that procedure and how to rebuild
multiple same-table indexes concurrently using Oracle job scheduler?
Thanks,
Boris.
Followup:
instead of
begin
execute immediate 'alter index idx1 rebuild';
execute immediate 'alter index idx2 rebuild';
end;
you can code:
declare
l_job number;
begin
dbms_job.submit( l_job, 'execute immediate ''alter index idx1 rebuild'';' );
commit;
dbms_job.submit( l_job, 'execute immediate ''alter index idx2 rebuild'';' );
commit;
end;
Now, just set job_queue_processes > 0 to set the "degree of threads" and in 8i
and before set job_queue_interval to say 60 or so and there you go.
How to Update millions of records in a table November 19, 2002
Reviewer: Boris Milrud from San Jose, CA USA
Thanks, Tom.
Your status said that you had a large backlog, so I decided not to wait for your
response and tried myself using dbms_job.submit() calls. At first, it did not
work (job_queue_processes was 0), but after I set it to 12 it started working.
The only difference between your code and mine is that I issue just one
commit at the end. It should not matter, right?
I selected a 1 million row table and rebuilt 5 non-partitioned indexes with the 'compute
statistics parallel nologging' clause.
Here are the numbers I got: rebuilding the indexes sequentially consistently took
76 sec., while using dbms_job.submit() calls took around 40 - 42 sec.
I said "around", because the technique I used may not be perfect, though it
served the purpose. I recorded the time right after the commit statement at the
end of the PL/SQL block - that's the start time. Then I kept querying the user_jobs view
every 2 - 3 sec, until the last of the 5 jobs was gone. That was the end
time.
The last question on this topic: is the user_jobs view the right place to look in
order to determine that rebuilding is done and how long it took? In the package I
am writing, I do massive delete operation, then rebuilding indexes, then
starting the next routine. What would be the best way to detect the end of
rebuilding, in order to proceed with the next call?
Thanks.
Boris.
Followup:
You can use user_jobs or dba_jobs but -- you might just want to put some
"logging" into your jobs themselves so you can monitor their progress and record
their times.
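A minimal sketch of that logging idea (the job_log table and its columns are hypothetical):
declare
  l_job number;
begin
  dbms_job.submit( l_job,
    'insert into job_log values ( ''idx1'', ''start'', sysdate );
     execute immediate ''alter index idx1 rebuild'';
     insert into job_log values ( ''idx1'', ''end'', sysdate );' );
  commit;
end;
/
Querying job_log afterwards shows when each rebuild started and finished.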
You can even rebuild all the indexes in a partition simultaneously November 21, 2002
Reviewer: ramakrishna from India
Thanks Tom,
In our environment, we have partitioned tables and we use:
ALTER TABLE table_name MODIFY PARTITION part_name REBUILD UNUSABLE LOCAL INDEXES
and this rebuilds all the indexes in the partition at one shot. We found this
much faster than doing the indexes one by one. We will now try out if we can
submit multiple such jobs in parallel (one for each partition of the table).
regards
ramakrishna
in case of deletes November 29, 2002
Reviewer: carl from moondog, mo
Hi Tom,
Thanks so much for your web site and help. It is our
number 1 reference in times of fear and loathing.
This is what we came up with concerning mass updates
INV 50M
INVINS 10M
INVDEL 7M
There are indexes on INV.KEY and INVDEL.KEY
Execution Plan ( for deletes and updates )
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 FILTER
2 1 TABLE ACCESS (FULL) OF 'INV'
3 1 INDEX (RANGE SCAN) OF 'IX_INVDEL01' (NON-UNIQUE)
alter table INVTMP nologging;
-- INVINS contains delta inserts and updates
insert /*+ APPEND */
into INVTMP
select *
from INVINS t1;
-- INVDEL contains delta deletes and updates
insert /*+ APPEND */
into INVTMP
select *
from INV t1
where not exists ( select null
from INVDEL t2
where t2.KEY = t1.KEY );
alter table INVTMP logging;
drop table INV;
rename INVTMP to INV;
-- build indexes etc
This is what we came up with, and it is the fastest
approach we've tested.
Any comments or suggestions are welcome and appreciated.
Followup:
I might try an anti-join on the NOT EXISTS and doing the work in a single
parallel CTAS
create table invtmp as
select * from invins
UNION ALL
select t1.*
from inv t1, invdel t2
where t1.key = t2.key(+)
and t2.key is null;
add parallel /nologging as needed.
in case of deletes - many thanks December 01, 2002
Reviewer: carl from moondog, mo
Ran the test cases at home: 500K rows in INV and 50K rows in
INVINS and INVDEL
My way: 16.33 secs
Your way: 08.35 secs
Many thanks.
A bit confused March 17, 2003
Reviewer: Choche from England
I was just wondering that none of the reviews mentioned whether these techniques
could be applied in a multi-user environment where multiple users could be
updating the same table at the same time.
Followup:
sorry -- I thought it obvious that in most cases "no" is the answer. we are
copying the data or locking excessive amounts of it -- or disabling indexes and
the like.
This is a "batch" process here, to update millions of records.
Replacing PK values July 25, 2003
Reviewer: Michael from San Diego, CA, USA
We need to replace the main ID for all entries in the database, which is the PK
in about 100 tables; the largest of them have about 2 mln rows. New and old values
for the ID are stored in a lookup table, and there are about half a million of them.
The original approach was to replace PK in all tables for one ID value, commit
the changes and move on to the next ID. This way referential integrity would be
guaranteed even if update fails at any stage, and we do not need a long rollback
segments and can restart the process at the point it aborted. But this update is
too lengthy, and in our case would take almost a month of continuous processing.
Do you have any suggestions on how to improve the performance of this critical
update? Would it be better to add new ID as an additional column into one table
at a time and populate all of it at once as "create table ... as select" (with
nologging option), and then rename table and drop old ID column?
Followup:
sounds like I would:
create table new_table as select
create table new_table2 as select
etc etc etc etc
drop old tables
add indexes
add primary/fkeys/constraints etc....
that is -- rebuild in bulk, using nologging (no redo, no undo generated), using
parallel when appropriate.
Update million Rows July 26, 2003
Reviewer: Reaz from Dhaka, Bangladesh
Dear Tom,
We have this situation, where we load data from 7 external files.
Next we had to run a transformation process, which checks if the row exists in
the target table and updates the existing row with the new one. Otherwise
inserts the row to the target table.
Our current process is very slow, sometimes taking a whole night to complete for,
say, 200,000 rows.
The procedures discussed so far are meant for updating only, but in our case we
need to update and insert too.
I would appreciate your suggestion.
Thanks.
Reaz.
Followup:
read about MERGE in the sql reference guide.
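For illustration, a hedged sketch of what such a MERGE might look like (the staging/target table and column names are invented):
merge into target t
using staging s
on ( t.id = s.id )
when matched then
  update set t.qty = s.qty, t.amt = s.amt
when not matched then
  insert ( id, qty, amt ) values ( s.id, s.qty, s.amt );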
A single batch? July 26, 2003
Reviewer: A reader
Tom,
Is it still okay if these steps (create a new table, drop the old one, and then
rename the new table as the old, create indexes etc.) are coded as the batch update
program? OR would you suggest these should be detailed as steps to
be performed under the supervision of a DBA? What I would like to know is:
can we still have the steps in a single batch routine (as the update would have been),
or should these be DBA instructions instead?
Thanks
Followup:
I would have these performed under the watchful eye of someone.
it is a one time bulk manipulation. it won't be a "pretty program" with tons of
graceful logging/error handling. You'll be doing it "down and dirty" (at
least I would be). It would need someone to watch over it.
What about space consideration July 28, 2003
Reviewer: A reader
Hi Tom,
I had a similar problem where my table had grown big and I wanted to truncate
some old data. My table was big, more than 20 million records, and I could not use
the create-temp-table, truncate and insert flow, as the space in the database was
not enough to hold 2 copies of the large table. I thought of doing a partition
on the table and then truncating one partition of the table, but that does not work.
Can you suggest some way which caters for this space constraint also?
Followup:
why didn't the partitioning work for you.
How to update part of a Very Large Table July 28, 2003
Reviewer: Mohan K from Bangalore, India
The update operation can be made simpler by separating out the query and update
operations. To do that, create another table which stores the rowids of the
records in the original table which have to be updated, along with the value
to be updated. Then run the PL/SQL script to update the records one by one. To
test the above method the following scripts can be used.
SQL> create table test_update(id number, name varchar2(100), description
varchar2(4000)) storage (initial 48M next 4M);
SQL> declare
v_n number;
v_name number;
v_desc number;
i number;
begin
for i in 1..1000000 LOOP
insert into test_update(id, name, description) values(i, 'Test Name'||i, 'Test
Name '||i||' description ');
END LOOP;
end;
/
Elapsed: 00:04:277.23
The above script will insert 1 million rows.
SQL> select count(*) from test_update where description like '%5 description%';
COUNT(*)
----------
100000
Elapsed: 00:00:02.63
SQL> create table test_update_rowids as select rowid rid, description from
test_update where description like '%5 description%';
Elapsed: 00:00:54.58
The table test_update_rowids stores the rowids and the new values that have to be
updated, i.e. 100000 rows need to be updated.
SQL> declare
begin
for c1 in(select rid, description from test_update_rowids)
LOOP
update test_update set description=c1.description||' after update' where
rowid=c1.rid;
END LOOP;
end;
/
Elapsed: 00:01:82.17
The above script performs the updation.
Mohan
Followup:
the above script performs an "updation" in the worst possible way, consuming the
most resources.
it should have been simply:
update test_update set description = description || ' after update ' where
description like '%5 description%';
Never do procedurally that which can be done in a single sql statement.
Updating 100K of 1 Million rows July 28, 2003
Reviewer: Mohan K from Bangalore, India
I ran the update statement as a single SQL statement and the time elapsed is
slightly more than the above method. It has to be tested against update
statements containing complex queries or join operations.
SQL> update test_update set description = description || ' after update ' where
description like '%5 description%';
Elapsed: 00:01:100.13
Mohan
Followup:
trust me -- that difference is negligible, and the amount of time you spent writing
code wasn't included.
Also, you FORGOT to add in the time to populate your temporary table -- no?
if you can do it in a single sql statement (and you almost always can) you
SHOULD.
Let's do your benchmark a little better:
ops$tkyte@ORA920> create table test_update(id number, name varchar2(100),
description varchar2(4000));
Table created.
ops$tkyte@ORA920>
ops$tkyte@ORA920> insert /*+ append */
2 into test_update
3 select rownum, 'Test Name' || rownum, 'Test Name ' || rownum || '
description '
4 from big_table.big_table
5 where rownum <= 1000000;
1000000 rows created.
ops$tkyte@ORA920> commit;
Commit complete.
ops$tkyte@ORA920>
ops$tkyte@ORA920> select count(*)
2 from test_update
3 where description like '%5 desc%' ;
COUNT(*)
----------
100000
ops$tkyte@ORA920>
ops$tkyte@ORA920> select count(*)
2 from test_update
3 where description like '%6 desc%' ;
COUNT(*)
----------
100000
ops$tkyte@ORA920>
ops$tkyte@ORA920> exec runstats_pkg.rs_start
PL/SQL procedure successfully completed.
ops$tkyte@ORA920>
ops$tkyte@ORA920> update test_update
2 set description = description || ' after update'
3 where description like '%6 description%';
100000 rows updated.
ops$tkyte@ORA920>
ops$tkyte@ORA920> exec runstats_pkg.rs_middle
PL/SQL procedure successfully completed.
ops$tkyte@ORA920>
ops$tkyte@ORA920> create table test_update_rowids
2 as
3 select rowid rid, description from test_update where description like '%5
description%';
Table created.
ops$tkyte@ORA920>
ops$tkyte@ORA920> begin
2 for c1 in(select rid, description from test_update_rowids)
3 LOOP
4 update test_update
5 set description=c1.description||' after update'
6 where rowid=c1.rid;
7 END LOOP;
8 end;
9 /
PL/SQL procedure successfully completed.
ops$tkyte@ORA920>
ops$tkyte@ORA920>
ops$tkyte@ORA920> exec runstats_pkg.rs_stop
Run1 ran in 1376 hsecs
Run2 ran in 9806 hsecs
run 1 ran in 14.03% of the time
so, according to me, the SINGLE (efficient, effective, correct way) update
statement runs in about 14% of the time. furthermore:
Name Run1 Run2 Diff
STAT...recursive cpu usage 5 4,662 4,657
STAT...CPU used by this sessio 448 5,397 4,949
only 4,949 MORE cpu seconds using the row by row approach...
STAT...CPU used when call star 448 5,397 4,949
STAT...free buffer inspected 10,162 17,039 6,877
STAT...db block gets 111,705 104,450 -7,255
STAT...dirty buffers inspected 9,016 16,620 7,604
STAT...calls to kcmgas 7,755 68 -7,687
STAT...Cached Commit SCN refer 7,718 0 -7,718
STAT...switch current to new b 7,719 0 -7,719
STAT...cleanouts only - consis 0 7,720 7,720
STAT...immediate (CR) block cl 0 7,720 7,720
STAT...commit txn count during 0 7,721 7,721
STAT...cleanout - number of kt 3 7,726 7,723
STAT...consistent gets - exami 3 7,760 7,757
LATCH.undo global data 221 7,985 7,764
STAT...redo entries 100,700 108,727 8,027
LATCH.redo allocation 100,795 108,945 8,150
STAT...db block changes 202,723 210,907 8,184
STAT...Elapsed Time 1,385 9,825 8,440
STAT...physical reads 7,894 16,476 8,582
LATCH.checkpoint queue latch 26,845 38,012 11,167
LATCH.cache buffers lru chain 35,638 52,827 17,189
STAT...no work - consistent re 7,741 100,072 92,331
STAT...table scan blocks gotte 7,719 107,721 100,002
STAT...table scan rows gotten 1,000,000 1,100,002 100,002
STAT...execute count 27 100,092 100,065
STAT...buffer is not pinned co 7,763 107,836 100,073
STAT...calls to get snapshot s 56 100,241 100,185
LATCH.shared pool 172 100,778 100,606
STAT...session logical reads 119,503 220,165 100,662
STAT...consistent gets 7,798 115,715 107,917
LATCH.library cache pin 233 200,865 200,632
STAT...recursive calls 177 200,870 200,693
LATCH.library cache 332 201,322 200,990
LATCH.cache buffers chains 587,377 835,167 247,790
STAT...redo size 31,323,804 31,836,904 513,100
STAT...session pga memory max 0 524,824 524,824
STAT...session pga memory 0 1,049,112 1,049,112
Run1 latches total versus runs -- difference and pct
Run1 Run2 Diff Pct
754,602 1,553,986 799,384 48.56%
PL/SQL procedure successfully completed.
and worse, it used 2x the number of latches! (scalability inhibitors)
There is nothing good about row at a time processing, it is something to do ONLY
when a set based operation isn't feasible
This is very good example for updating the records July 28, 2003
Reviewer: Bipin Ganar from INDIA
Hi Tom,
Yes, this is a good example to update the records, but my problem
is I have 1 million records in one table. I want to update and give a hike of 10% to
their salaries in one table, but that table is accessed by any number of people at the
same moment. I cannot drop the table and update; and second, how can I define the
tablespace in this script?
Create Table a as select * from emp_sal;
Followup:
create table a
tablespace foo
as
select * from emp_sal;
slightly different problem! July 28, 2003
Reviewer: Srikanth Adiga from Pune, India
My table:
5M records are inserted each month.
Operations:
5M records inserted on the first day of the month
5M records updated on the rest of the days of the month.
Note that this update updates each row with different values.
I am using OCI and using batch operations. Can you please suggest what else
I can do within my program to improve the performance of my updates? I tried
multithreading, but that did not help.
anything else?
Followup:
it only takes a couple of minutes to update 5 million rows on my desktop pc --
so, there must be something more involved here.
I'd be able to do that in minutes -- why is it taking you so long -- what is
your bottleneck.
space considerations July 28, 2003
Reviewer: A reader
could not create another temp table which holds the same data as the table being
partitioned, as there was not enough space on the hard disk to hold 2 (or 1.5, to be
precise) copies of the same huge table. Is there a way to partition without having
to drop, truncate or transfer data to some temp location and then partition?
Followup:
disk is cheap.
time is money.
your time spent just thinking about this problem cost much more than the disk to
solve it would.
penny wise, pound foolish. How big is this 20million row table (assuming you
are the same "a reader" from above). Maybe 1-3 gig?
slightly different problem! July 29, 2003
Reviewer: Srikanth Adiga from Pune,India
couple of minutes!!! Oh dear, that's really too good.
Let me reiterate my case:
My table schema is something like this:
SubscriberVsQtyTable:
int client-id
int year
int month
int quantity
primary key - (client-id, year,month)
This table has 5M client ids.
Now every day, I have to update the quantity field with the txns made for that day.
Note that I have to update one subscriber at a time, since the number of txns
made by each subscriber is different.
So, even now do you believe you can do this in a couple of minutes? My app takes
about 30 to 50 mins even on bigger machines like 8 CPUs.
As I said, this is a C++ app, running Solaris + Oracle 8i/9i. We are using OCI.
Thanks,
Followup:
if you are doing that work with more than a single sql statement, I could see it
taking a while.
If you do that work in a single update, no, I don't see it taking more than a
couple of minutes.
Time for Update July 29, 2003
Reviewer: Mohan K from Bangalore, India
It is my understanding that the time for the update depends on the following
factors. There may be something else also involved.
1. Row chaining/migration caused by the update.
2. The where clause used to select the rows to be updated.
3. Percentage of rows updated in a block. The more the better. Even if one row is
updated in a block containing 10 records, the entire block has to be written by DBWR
and LGWR.
So it is not just the number of records that matters. 100 million records may take
days for one table and may take less than an hour for another table with few
columns.
Mohan
Followup:
1) updating a quantity (number) field is unlikely to do that.
2) correct
3) correct
no, i disagree with the last part. to update 5million rows, a single unindexed
column, should not take "a really long time". I was answering their direct
question -- they update every row, on every block. so #1 and #3 really don't
come into play. The biggest factor is #2 here
Verifing if neccesary to update, before update July 29, 2003
Reviewer: juan carlos reyes from Bolivia
Hi Tom, I have seen several times that you can
increase performance by evaluating whether you really need to update, or maybe it
was only my impression.
For example
Update table set income = 500
where level = 7
AND NOT INCOME = 500
The AND NOT INCOME = 500 allows you to avoid unnecessary updates;
my point is a question:
how advisable is it, does it really increase performance?
As you are always thinking about how to increase performance, it could be a good idea
in a newer version of Oracle that if Oracle verifies that the value was already set to
that value, it does not update.
:)
Followup:
if there are more than a handful of records where income = 500, then yes, this
will definitely increase performance by
a) decreasing the latching
b) decreasing the redo
c) decreasing the undo
we cannot make that optimization for you -- it would stop triggers and other
EXPECTED side effects from happening.
July 29, 2003
Reviewer: A reader
:) Thanks,
Maybe then it could be an optional feature
UPDATE_ONLY_IF_DIFFERENT hint
I think it could interest several people
who don't use triggers, nor constraints, etc.
July 29, 2003
Reviewer: Stu Charlton from Toronto Canada
"Note that, I have to update one subscriber at a time since the number of txns
made by each subscriber if different."
That still seems doable in 1 SQL statement, even if you wanted to merge the
updates. Couldn't you just do a COUNT(*) grouped by client_id, year, and month?
Followup:
I agree -- most things CAN be done in a single sql statement.
easier said than done in real life situations July 29, 2003
Reviewer: A reader
Hi Tom,
I agree you may think getting a disk is cheaper; that is only when you are
considering just getting up or going to the store and getting a disk.
In real life conditions, especially in corporates, this is not as easy as it seems, nor
is it less costly.
A simple purchase of disk has first to be justified
(which is quite difficult -- I don't think I have the justification as yet), then be
approved, then be purchased by the purchase dept, then be installed or fixed by the
server teams; the server team may want to take the system down, which means more
dollars, much more. These are just things off the top of my head; when we actually initiate
the purchase process, there will definitely be some more steps and processes.
Coming back to my question, am I right in saying that we can't partition an
existing table without copying the data to some temp location? Let me know if there
is any other way.
Thanks
Followup:
justification
you pay me a lot
disk is cheap.
think about it.
(sometimes it really can be that simple)
we are talking about a couple of gig (like 1 to 3 -- small, tiny -- my /tmp has
more room than that) here, something my LAPTOP would not have an issue with.
in order to partition -- think about it -- you'll need the SOURCE DATA and the
PARTITIONED DATA -- at the same time, for a period of time. no magic there.
slightly confused ! July 30, 2003
Reviewer: Srikanth Adiga from Pune, India.
Thanks, Stu and rest for your updates.
Taking this a bit further, since I am confused :)
[Sri]
"Note that, I have to update one subscriber at a time since the number of txns
made by each subscriber if different."
[Sri]
[Stu Charlton]
That still seems doable in 1 SQL statement, even if you wanted to merge the
updates. Couldn't you just do a COUNT(*) grouped by client_id, year, and month?
Followup:
I agree -- most things CAN be done in a single sql statement.
[Stu Charlton]
Sorry, how would one do this?
If my table has two rows:
clientid=1, month=july, year=2003, quantity=10
clientid=2, month=july, year=2003, quantity=20
Now I have to update (clientid=1)'s quantity by 15 and (clientid=2)'s quantity
by 25.
How would you manage this in a single SQL? Like this there would be 5M rows to
be updated.
Btw, my table is indexed on clientid,year and month.
Followup:
what is the OTHER table you are updating from? you must have a detail table
elsewhere from where you derive that 15 and 25. So, assuming something like
this:
ops$tkyte@ORA920> create table t ( clientid int, month int, year int, quantity
int );
Table created.
ops$tkyte@ORA920> create table txns ( clientid int, month int, year int );
Table created.
ops$tkyte@ORA920>
ops$tkyte@ORA920> insert into t values ( 1, 7, 2003, 10 );
1 row created.
ops$tkyte@ORA920> insert into t values ( 2, 7, 2003, 20 );
1 row created.
ops$tkyte@ORA920>
ops$tkyte@ORA920> insert into txns select 1, 7, 2003 from all_objects where
rownum <= 15;
15 rows created.
ops$tkyte@ORA920> insert into txns select 2, 7, 2003 from all_objects where
rownum <= 25;
25 rows created.
ops$tkyte@ORA920>
ops$tkyte@ORA920>
ops$tkyte@ORA920> select * from t;
CLIENTID MONTH YEAR QUANTITY
---------- ---------- ---------- ----------
1 7 2003 10
2 7 2003 20
ops$tkyte@ORA920> update t
2 set quantity = quantity + ( select count(*)
3 from txns
4 where txns.clientid = t.clientId
5 and txns.month = t.month
6 and txns.year = t.year );
2 rows updated.
ops$tkyte@ORA920> select * from t;
CLIENTID MONTH YEAR QUANTITY
---------- ---------- ---------- ----------
1 7 2003 25
2 7 2003 45
there you go.
(it is a shame you are using the wrong datatypes -- only a DATE should be used
to hold, well, DATES. Using numbers to hold a year and a month isn't a good
practice)
SCN,CHECK POINT July 30, 2003
Reviewer: AATIF from PAKISTAN
I am asking: what is the difference between the SCN (system commit number, system change number)
and a checkpoint?
On which does Oracle base database synchronization -- the SCN or the checkpoint?
When a checkpoint occurs, which number is written to the control file header, datafile and log
file headers?
Dear Tom,
your web site tells me not to ask more questions now, but I need to clear up my point.
AATIF.
Followup:
er?
i remember you asking this on another page -- yes.
and I also remember answering this on that other page -- yes.
I fail to see how this relates to this page.
anyway there IS no difference between
scn
system commit number
system change number
they are the same.
as for the rest of the comment -- it doesn't "make sense" (i do suggest a read
through of the server concepts guide, many questions are answered there!)
More clarification August 01, 2003
Reviewer: Srikanth Adiga from Pune,India
>what is the OTHER table you are updating from? you must
>have a detail table elsewhere from where you derive that
>15 and 25.
It is coming from an OCI application after processing some files.
Followup:
use bulk processing - update between 100 and 500 rows at a time in a single call
using array processing.
Or, put the data where it belongs - not in files, in tables -- and process the
data in the database rather than in C
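As a rough illustration of the array-processing idea in PL/SQL terms (the OCI equivalent binds host arrays to a single statement execution; the subscriber_qty table and the collections here are invented):
declare
  type num_tab is table of number index by binary_integer;
  l_ids num_tab;
  l_qty num_tab;
begin
  -- fill the bind arrays (in the real application these come from the processed files)
  for i in 1 .. 500 loop
    l_ids(i) := i;
    l_qty(i) := i * 2;
  end loop;
  -- one call updates all 500 rows
  forall i in 1 .. l_ids.count
    update subscriber_qty
       set quantity = quantity + l_qty(i)
     where clientid = l_ids(i);
  commit;
end;
/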
August 06, 2003
Reviewer: Srikanth Adiga from Pune, India
>what is the OTHER table you are updating from? you must
>have a detail table elsewhere from where you derive that
>15 and 25.
It is coming from an OCI application after processing some files.
>Followup:
>use bulk processing - update between 100 and 500 rows at a >time in a single
call using array processing.
Yes, that is what we do. It takes about 60 mins to update 5M records on a 2-CPU
machine. Is this much expected?
If I do the same in multiple threads I do not see any performance improvement,
i.e. 1M records updated in parallel in each of 5 threads.
Any idea why?
Followup:
are you sure they are going in parallel and not blocking/locking each other.
verify that all 5 sessions are in fact not only ACTIVE (v$session) at the same
time but that you are not blocking yourself.
and you have taken some statspacks during your testing to see if you have any
obvious bottlenecks right?
Explicit commit ! August 07, 2003
Reviewer: Nathan from London
Tom,
Why do we commit explicitly after submitting jobs ?
Thanks
Nathan
/*------------------------------------------------
Followup:
instead of
begin
execute immediate 'alter index idx1 rebuild';
execute immediate 'alter index idx2 rebuild';
end;
you can code:
declare
l_job number;
begin
dbms_job.submit( l_job, 'execute immediate ''alter index idx1 rebuild'';' );
commit;
dbms_job.submit( l_job, 'execute immediate ''alter index idx2 rebuild'';' );
commit;
end;
---------------------------------*/
Followup:
because if you do not YOUR session can "see" the job in the job queue but the
job queue processes cannot!
How many of us have sat there for minutes going "when is the stupid thing going
to run" and until we exit sqlplus it does nothing :)
Explicit Commit August 07, 2003
Reviewer: Nathan from London
Apologies,
I must withdraw my previous question... i was obviously not thinking straight. I
was thinking about a similar situation where a commit was issued ( before
dbms_job.submit ) with a comment /* do not remove */ without any
explanations... I'm still pondering why the commit should be there.
Sorry for the trouble.
Regards
Nathan
I have to take issue.... August 07, 2003
Reviewer: cheaper than disk from USA
Your response about justification for buying disk space is outdated...
"justification
you pay me alot
disk is cheap.
think about it.
(sometimes it really can be that simple)"
Anybody else out there NOT paid alot ANYMORE?!
;-)
Re: Explicit Commit - To Nathan August 07, 2003
Reviewer: A reader
Tom is out till weekend..
"I'm still pondering why the commit should be there. "
The commit after dbms_job is required so that other sessions can see the
job queue... which typically is the
requirement... As usual, Tom can confirm.
Menon:)
Reader August 10, 2003
Reviewer: A reader
"
create table invtmp as
select * from invins
UNION ALL
select t1.*
from inv t1, invdel t2
where t1.key = t2.key(+)
and t2.key is null;
"
Like to know why
"
where t1.key = t2.key(+)
and t2.key is null
"
better than
where t1.key != t2.key with ALWAYS_ANTI_JOIN hint
Followup:
because there is no "always anti join" hint?
you could of course use the CBO, and a NOT IN subquery in which case it'll do it
itself.
August 14, 2003
Reviewer: Su Baba
So if I have a big table, x, that needs to be updated, I would
create table y nologging as select * from x...
retrieve index definition from x
build indexes on y
drop table x
rename y to x
If I have all these done in a PL/SQL batch job, what's the best way to retrieve the
index definitions and form the create index statements from table x? I want to create
indexes on y that have the exact same storage parameters, tablespace, maybe partitioning,
etc. as the ones on table x.
By the way, the database version is 8.1.7.
thanks
Followup:
you COULD, not WOULD, it is an option.
you don't have your create indexes somewhere?
guess I would exp rows=n, imp indexfile=foo to get them -- or
DBMS_METADATA.get_ddl in 9i
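For the 9i route, a minimal sketch, assuming the table is called X and you own it:
set long 100000
select dbms_metadata.get_ddl( 'INDEX', index_name )
  from user_indexes
 where table_name = 'X';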
How to update part of a Very Large Table August 28, 2003
Reviewer: Mohan K from Bangalore, India
please refer "How to update part of a Very Large Table July 28, 2003"
When updating a large table in a database having a comparatively small buffer cache,
one thing to make sure of is that blocks are updated in a continuous order. Suppose
some records of block1 are updated first in the beginning, and then again after updating
some other blocks; block1 will then be written twice by DBWR and LGWR. After the
first update block1 may get flushed out by the LRU algorithm. If all records in
block1 are updated at one time then block1 is written by LGWR and DBWR only once.
If the update statement is doing a full table scan then blocks are updated in
order. But the order can be different when it is using an index for selecting
the rows.
The SQL statement that makes sure of that is
update (select id, description from test_update where rowid in(select rowid from
(select rowid from test_update order by rowid))) set description=description||'
new val' where description like '%5%';
Followup:
not necessarily so -- first lgwr doesn't write blocks -- it writes redo.
second, dbwr may or may not write the block more than once.
I would never never never use your approach. why would I want to
a) read the entire table
b) sort the rowids
c) distinct that set (implied)
d) join to it (and you are making the assumption the table in the IN will be the
driving table and that a hash join will not be used and ..... lots of other
stuff)
when a simple:
update T set d = d || 'x' where d like '%5%'
will do it -- if you are worried about the index on d being used - you would
simply FULL hint it -- but doubtful that it would be necessary.
the additional work you add would more than offset the work you are trying to
save.
update large table August 28, 2003
Reviewer: Vera from D.C.
Hi Tom and thank you again and again for all your work.
My problem is that I cannot simply replace the update with creating a new table. The
table I need to update - I'll call it the 'pension table' - has about 1.5 mln rows.
I have to update "amount" fields for the 1994, 1995, 1996 etc. years, and the way
I update these fields depends on the status of the participant in each particular
year. To determine this status I built a function that looks through other
tables and takes several input parameters, including each particular financial
year, because it determines not just the status of a participant, but the status he/she
had in that particular FY. (This function I call a "box of cockroaches"). Basically,
I ask Oracle to bring me annual monthly amount for Mr/Mrs 'X' for, let's say,
1994 and if 'X' had status 'retired' in 1994 then I want this amount to be
calculated one way, if 'X' was 'deferred' -another way and if 'X' was 'dead'
third way.
As you already guessed, updates take a long time - about an hour for each year. I
wonder what approach would you use in a situation like this?
Thank you
By the way, what is a status of your new book?
Followup:
I would use a CASE statement in SQL without using PLSQL to compute the value.
optionally -- i would turn it inside out. instead of sql calling plsql, this
might be one of the times when plsql calling sql is more appropriate. build up
procedurally an array of values to update -- forall update them -- and do it
again until the rows are processed. with a little foresight, you can even
parallelize this processing and run your procedure N times -- each instance of
it running against a different slice of the table...
book is "real soon" -- should be in trucks right now.
update large table August 28, 2003
Reviewer: Luc Gyselinck from Belgium
For large tables, if I have to update more than 20 percent of the rows, I do it
like Tom: write a query to insert the 'updated' records in a new table, using
the /*+ APPEND */ hint. You have no indexes to update, no constraints to
validate and you generate almost no redo. Next, I truncate the old table, after
I have disabled all constraints and triggers (once again, almost no redo, very
fast), and then I inject, again with the /*+ APPEND */ hint, the data from the
new table into the old (very fast, little redo). Indexes get automaticaly
rebuild AFTER the extents are populated. I reenable the triggers, the
constraints (with or without validation, as you wish).
If the new values for the columns being updated must come from other tables, I
NEVER write something like
update t1
set c1 = (select c2 from t2 where t2.c3 = t1.c4)
In fact, you are performing NESTED LOOPs. Using functions to get the new values
for the columns is much the same : NESTED LOOPS (even worse: SQL / PLSQL engine
context switches, open/close cursors in PL/SQL, NO read consistency).
Whenever I find myself in such a situation (typically when writing batch
procedures, during data migrations, data transformations), I make sure my
queries use HASH joins, I give the session more resources (higher
SORT_AREA_SIZE, higher HASH_AREA_SIZE, DB_MULTIBLOCK_READ_COUNT to the max),
avoid the INDEX RANGE SCANS, do FULL (yes FULL) table scans, do FAST FULL INDEX
scans, thus bypassing the buffer pool which otherwise gets flushed by the data
from my queries. And by the way, use LMTs with uniform size extents.
As an example, I rewrote a batch procedure that took 3 days to complete (once,
many years ago), written the old way (see the update statement above), that now
does the same job in only 4 hours, on the same (old) hardware, same database,
but different Oracle version: Oracle 7 / Oracle 8i.
Yes, you are right, August 28, 2003
Reviewer: Vera from D.C.
but right now I don't see how I can perform my task without an update statement and
a function. In other cases I do exactly the same thing you do on large tables -
create new table rather than update old one and then rename tables.
As for LM tablespaces, I have been begging our DBAs to switch on LMT since I don't know
when, and they keep promising to do it.
Does parallel insert just imply direct? September 08, 2003
Reviewer: Dennis from Missouri, USA
Tom (or anyone),
Does a parallel insert imply direct (append) insert? I read something in the
documentation (or is that forbidden territory) ;) awhile back about the append
hint not being needed on a parallel insert because it was implied.
I issued the following:
INSERT /*+ append, nologging, parallel(ext, 8) */ INTO
hrdmadm.ext_xpayd_pay_details ext
SELECT /*+ parallel(ext2, 8) */ *
FROM hrdmadm.ext_xpayd_pay_details@hip2 ext2
and while I saw the parallel processes spawn on hip2 (for the select), I didn't
notice any spawn where I was.
I was wondering if that was because parallel insert was synonymous with direct
insert, or did I mess up the syntax somehow? Is the insert in fact parallel,
and it's just 'hidden' from us?
Thanks,
Dennis
Followup:
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96524/c21dlins.htm#10629
quote:
Serial and Parallel Direct-Path INSERT
When you are inserting in parallel DML mode, direct-path INSERT is the
default. In order to run in parallel DML mode, the following requirements must
be met:
......
Thanks! September 08, 2003
Reviewer: Dennis from Missouri, USA
My bad was that I didn't alter session. Thanks Tom for your valuable insight.
I guess I was just too afraid to read down a few more pages when I was in the
documentation, with that being forbidden territory and all ;) Well maybe not
forbidden...just hard to put that 'haunted/fear-inspiring' word that would fit
in there.
Thanks,
Dennis
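A minimal sketch of that fix -- putting the session into parallel DML mode before the insert (statement as in the earlier example, degree illustrative):
alter session enable parallel dml;
insert /*+ append parallel(ext, 8) */ into hrdmadm.ext_xpayd_pay_details ext
select /*+ parallel(ext2, 8) */ *
  from hrdmadm.ext_xpayd_pay_details@hip2 ext2;
commit;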
Optimizing Query September 11, 2003
Reviewer: Mohan
BULK collect and BULK insert can improve the performance. Create SQL objects of
type t1%rowtype and t2%rowtype.
BULK collect 100 records at a time into the collection array. Create separate
collection arrays to store values that are computed or obtained after searching
the database.
Do BULK insert/update into the target table. Search this site using key words
"bulk insert" or "bulk collect" or "bulk update" for more examples.
Other things are
1. Improve query performance by proper indexing
2. Remove fragmentation by "alter table move" command
3. Rebuild indexes
4. Allow large pctfree if the update causes row migration/row chaining
5. Size rollback segments and rollback tablespaces
Mohan
Followup:
#2 isn't really "a good thing" in general.
#3 is really not "a good thing" in general.
reorgs -- so so overrated.
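Going back to the BULK COLLECT / FORALL pattern suggested above, a rough sketch (src_table and tgt_table are invented names) that fetches 100 rows at a time and bulk-inserts them:
declare
  cursor c is select id, amount from src_table;
  type id_tab  is table of src_table.id%type     index by binary_integer;
  type amt_tab is table of src_table.amount%type index by binary_integer;
  l_ids id_tab;
  l_amt amt_tab;
begin
  open c;
  loop
    fetch c bulk collect into l_ids, l_amt limit 100;
    exit when l_ids.count = 0;
    -- compute any derived values here, then insert the whole batch in one call
    forall i in 1 .. l_ids.count
      insert into tgt_table ( id, amount ) values ( l_ids(i), l_amt(i) );
  end loop;
  close c;
  commit;
end;
/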
September 11, 2003
Reviewer: Kailash from NJ
Can you explain which of the 3 operations will be faster. If I want to update
millions of rows,
1. then would delete/reinsert be faster or
2. mere update will be faster
3. the one you suggested will be faster.
Can you advise as to why the method you had suggested will be faster than 1 and
2.
Can you explain why updating millions of rows is not a good idea. Is it based on
the amount of redo generated and what if the columns I update are not part of
the index?
Which operation will be the fastest and I request you to explain the various DML
operations and the amount of redo they generate. ie. I would appreciate if you
could explain me which DML generates most redo, followed by the second followed
by the 3rd which generates the least redo.
Currently we have an application that checks for a condition in table A and
obtains the unique id. If a record is found, then it flushes all of the dependent
tables for the given id and re-inserts these records back into the dependent
tables with a set of new values. Is this a good approach? If not, what would be
the ideal approach?
Followup:
I can say yes to all three -- each is the fastest.
each is the slowest.
If you have a large bulk update that is done once and hits most of the rows
-- you may be best off by doing a CTAS, drop old and rename new.
the CTAS, indexing, etc can be done in nologging mode -- no undo, no redo. you
end up getting done much faster.
updating millions of records October 03, 2003
Reviewer: b from austin TX
Hi,
The approach of updating the view instead of a table works great.
But here I want to update multiple columns of the same table based on different
table queries.
We have around 600,000 records in the base table.
What could be the least expensive query?
We run Oracle 9i.
UPDATE user1.UserSummary
Set
Column1 = (
SELECT Count(DISTINCT WebOrderID)
FROM user1.OrderTotal ot
WHERE total>0 and
(ot.ostatus <> -1 and
ot.ostatus <> -1) and
ot.UREFITEM = user1.UserSummary.UREFITEM),
Column2 = (
SELECT NVL(Sum(Total),0) FROM
user1.OrderTotal ot
WHERE (ot.ostatus <> -1 and ot.ostatus <> -1)
and ot.UREFITEM = user1.UserSummary.UREFITEM),
Column3 = (
SELECT min(sodate) FROM orders.salesorders so left join
user1.ordertotal ot on ot.sorefitem = so.sorefitem
WHERE total>0 and (so.ostatus <> -1 and so.ostatus <> -1)
and so.UREFITEM = user1.UserSummary.UREFITEM),
Column4 = (
SELECT max(sodate) FROM orders.salesorders so
left join user1.ordertotal ot on ot.sorefitem = so.sorefitem
WHERE total>0 and (so.ostatus <> -1 and so.ostatus <> -1) and
so.UREFITEM = user1.UserSummary.UREFITEM)
;
the above is the sample of my query
thanks in advance
;;)
Followup:
if you can come up with a single query that pulls all 4 columns together in a
single result set -- we can optimize this. Left as an exercise for you -- see
if you cannot merge the above 4 separate queries into a single query.
great October 17, 2003
Reviewer: b from tx
thanks
i finally made my query simple
Great October 21, 2003
Reviewer: Raj from San Diego CA USA
Great technique never thought of it.
Updating 1 million records October 22, 2003
Reviewer: Kailash from NJ, USA
I have a situation where I have a unique key on 3 columns (c1,c2,c3) on a table
T and I want to update c2 from A to B for some records that match certain
criteria. There are a total of 30 million records in the table and there will be
about 1 million records in the table that match the criteria to be updated. The
table is not partitioned.
1. Your suggestion to drop the old table and to re-create a new table and
rename it to the old table may not be possible because I am updating only a
subset of the table. Is there any other method to achieve this?
2. Which one of the following will be a better solution?
a. Drop the index, update the rows and recreate the index.
b. Delete the rows with C2 = A and insert new rows with C2 = B. My DBA says
that this will create fragmentation and is not a better approach. Is that true?
c. Do you suggest any better solution under these scenarios, since I keep
getting these types of questions from my manager often?
Your suggestion in this regard is highly appreciated.
Followup:
have you considered just "update set c2 = b where ...."?
1,000,000 rows out of 30,000,000 isn't a lot. if c2 is indexed, you might
benefit from a set unusable, skip unusable, update, rebuild.
but it won't take that long to update 1,000,000 rows. couple of minutes on my
laptop
big_table@ORA920> update big_table set created = created+1;
1833792 rows updated.
Elapsed: 00:02:13.28
not a big deal.
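A hedged sketch of that unusable/skip/rebuild sequence (the index name and criteria are invented; note that a unique index enforcing the key cannot be skipped this way, only non-unique ones):
alter index t_c2_idx unusable;
alter session set skip_unusable_indexes = true;
update t set c2 = 'B' where c2 = 'A' and <your criteria>;
commit;
alter index t_c2_idx rebuild nologging parallel 4;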
How about rollback segment problem when updating million rows. October 22, 2003
Reviewer: Ramasamy from NC USA
Tom ,
Thanks a lot for your service.
I have to update a single column NUMBER(9) where
the table has got more than 80 columns.
This table has got 30 million rows and i have to update
around 17 million rows each night.
As you always insisted ,
i tried with a single update like
UPDATE t1 SET batchno = (A constant )
WHERE batchno is NULL;
1.)I came across the rollback segment issue.
2.) It is taking around 3 hrs to update.
Your input is greatly appreciated to solve the above issues.
Thanking You,
Ramasamy
Followup:
there is no problem -- other than that you haven't sized your system to do the things
you require it to do!!!!!!!!
1) the rollback segment "issue" is a "non issue". size things for what you do
and there are NO ISSUES. disk is cheap -- compared to what they pay us all to
have this discussion. consider that. the amount of time you've spent thinking
about this already cost your employer much more than the disk it would take to
do this correctly.
2) you are changing a null to a non-null -- increasing the width of the field.
make sure you are not causing massive row migration (check your pctfree, make
sure it is set appropriately here). is batchno "indexed"?
MILLION RECORD INSERT/SELECT/UPDATE TAKES TOO MUCH TIME October 23, 2003
Reviewer: Santosh Das from Bangalore, INDIA.
Hi Tom,
I have a table A, and soon after creating it 1 million records are put into it.
Then, it being in a real-time environment, its size goes on increasing by at least
half a million each day; however, a cleanup application removes half a million
every day. After inserting 1 million at the creation of the table, it is put in the
real-time environment. Whenever a new record is inserted, a continuously listening
application detects it and applies a select operation on table A. Then it sends
the selected items to some table B and then updates table A.
The listening applications may be 3 or 4 in number. However, the select and update
operations are taking a lot of time, and the initial insertion of 1 million
records is taking 11 hours to complete, so what can be the possible architecture?
I tried sending the rowid to a queue/pipe after inserting into table A and then
used it to select and update. But it is still taking a lot of time. Please suggest
some method.
Thanks and regards,
Sandy..
Followup:
sounds like you should just be using AQ -- plop the record in as a message --
have as many dequeuers as you like.
not sure what else to say, not a ton of detail here.
Select and update of a table Having Millions of Records. October 23, 2003
Reviewer: Jack from NY
Hi Tom,
I have the following queries which execute on a table of a million records. That
table has a PK, but it is not used in either of the queries given below. CMTIME, SAD and
STATUS are all NUMBER fields. If I make an index on cmtime then it takes longer.
Presently I don't have an indexed field. Can you please tell me how I can optimize these
or where the performance bottleneck is?
update SD set CMTIME = 1064929834 where (CMTIME is NULL or 1064929834 - CMTIME
>= 10) and (1064929834 - SAD >= 0) and 1064929834 not in (select CMTIME from SD
where CMTIME is not NULL)
SELECT * from SD where (CMTIME is not NULL and (CMTIME = 1064929834 OR
1064929834 - CMTIME >= 10)) AND (ED - 1064929834 > 0 and (STATUS = 1 OR STATUS =
0) and 1064929834 - SAD >= 0)
Thanks and regards,
Jack.
Followup:
are you using the CBO.
Followup to your question October 23, 2003
Reviewer: Ramasamy from NC USA
2) you are changing a null to a non-null -- increasing the width of the field.
make sure you are not causing massive row migration (check your pctfree, make
sure it is set appropriately here). is batchno "indexed"?
Yes Tom. The Batchno is indexed like
index on (BATCHno,STATUS);
Will it be worth dropping the index and recreating it?
Thanks,
Ramasamy
Followup:
could definitely help, yes.
October 23, 2003
Reviewer: George Lee from Hong Kong
I tried to update 30 million records using Tom's solution.
It works fine, taking approximately 1.5 hours to finish.
George (Broadway Photo Supply Ltd. HK)
whats CBO October 24, 2003
Reviewer: Jack from NY
Followup:
cost based optimizer
No October 27, 2003
Reviewer: jack from NY
We are not using any cost based optimizer.
It's a simple query which we are using to retrieve data from a table having
millions of records, and its performance is not satisfactory.
How to optimize it so that it runs faster ?
jack
Followup:
you are kidding right?
this goes right up there with "my car won't start, why not?"
there is so much missing from this "request" as to make it impossible to say
anything sensible.
November 11, 2003
Reviewer: George Lee from Hong Kong
I have a fact table being partitioned by month. The indexes are built local to
partition.
The data population is working fine.
But the rebuild index fails at the end.
I don't know how to rebuild indexes in the partitions.
Can you help me ?
Followup:
why do you need to rebuild here at all??
but why don't you show us what you are doing -- at the very least -- define what
"fails at the end" means (error codes, messages, details)
November 12, 2003
Reviewer: George Lee from Hong Kong
Dear Tom,
Yes, I should provide enough information to you. Sorry for that.
Here is the story,
I create a table as following,
drop table D_DYNRPT_SALES_FACT;
create table D_DYNRPT_SALES_FACT
(
TX_DATE DATE,
SHOP_NO VARCHAR2(20),
ITEM_NO VARCHAR2(12),
PARENT_ITEM_NO VARCHAR2(12),
BRAND_NO VARCHAR2(5),
VENDOR_NO VARCHAR2(20),
ITEM_GROUP VARCHAR2(5),
ITEM_TYPE VARCHAR2(5),
CONSIGNMENT VARCHAR2(1),
ITEM_DESC VARCHAR2(40),
BRAND_DESC VARCHAR2(30),
VENDOR_DESC VARCHAR2(30),
CATEGORY_DESC VARCHAR2(30),
QTY_SOLD NUMBER(14,2),
NET_SALES_AMT NUMBER(14,2),
GROSS_PROFIT_AMT NUMBER(14,2)
)
PARTITION BY RANGE (TX_DATE)
( PARTITION D_DYNRPT_SALES_FACT_2000_09 VALUES LESS THAN
(TO_DATE('01/10/2000','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2000_09,
PARTITION D_DYNRPT_SALES_FACT_2000_10 VALUES LESS THAN
(TO_DATE('01/11/2000','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2000_10,
PARTITION D_DYNRPT_SALES_FACT_2000_11 VALUES LESS THAN
(TO_DATE('01/12/2000','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2000_11,
PARTITION D_DYNRPT_SALES_FACT_2000_12 VALUES LESS THAN
(TO_DATE('01/01/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2000_12,
PARTITION D_DYNRPT_SALES_FACT_2001_01 VALUES LESS THAN
(TO_DATE('01/02/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_01,
PARTITION D_DYNRPT_SALES_FACT_2001_02 VALUES LESS THAN
(TO_DATE('01/03/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_02,
PARTITION D_DYNRPT_SALES_FACT_2001_03 VALUES LESS THAN
(TO_DATE('01/04/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_03,
PARTITION D_DYNRPT_SALES_FACT_2001_04 VALUES LESS THAN
(TO_DATE('01/05/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_04,
PARTITION D_DYNRPT_SALES_FACT_2001_05 VALUES LESS THAN
(TO_DATE('01/06/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_05,
PARTITION D_DYNRPT_SALES_FACT_2001_06 VALUES LESS THAN
(TO_DATE('01/07/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_06,
PARTITION D_DYNRPT_SALES_FACT_2001_07 VALUES LESS THAN
(TO_DATE('01/08/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_07,
PARTITION D_DYNRPT_SALES_FACT_2001_08 VALUES LESS THAN
(TO_DATE('01/09/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_08,
PARTITION D_DYNRPT_SALES_FACT_2001_09 VALUES LESS THAN
(TO_DATE('01/10/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_09,
PARTITION D_DYNRPT_SALES_FACT_2001_10 VALUES LESS THAN
(TO_DATE('01/11/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_10,
PARTITION D_DYNRPT_SALES_FACT_2001_11 VALUES LESS THAN
(TO_DATE('01/12/2001','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_11,
PARTITION D_DYNRPT_SALES_FACT_2001_12 VALUES LESS THAN
(TO_DATE('01/01/2002','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2001_12,
PARTITION D_DYNRPT_SALES_FACT_2002_01 VALUES LESS THAN
(TO_DATE('01/02/2002','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2002_01,
PARTITION D_DYNRPT_SALES_FACT_2002_02 VALUES LESS THAN
(TO_DATE('01/03/2002','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2002_02,
PARTITION D_DYNRPT_SALES_FACT_2002_03 VALUES LESS THAN
(TO_DATE('01/04/2002','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2002_03,
PARTITION D_DYNRPT_SALES_FACT_2002_04 VALUES LESS THAN
(TO_DATE('01/05/2002','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2002_04,
.
.
.
PARTITION D_DYNRPT_SALES_FACT_2004_12 VALUES LESS THAN
(TO_DATE('01/01/2005','DD/MM/YYYY'))
TABLESPACE D_DYNRPT_SALES_FACT_2004_12,
PARTITION D_DYNRPT_SALES_FACT_MAXVALUE VALUES LESS THAN (maxvalue)
TABLESPACE D_DYNRPT_SALES_FACT_MAXVALUE
)
/
CREATE INDEX D_DYNRPT_SALES_FACT_I01 ON D_DYNRPT_SALES_FACT
(TX_DATE) LOCAL;
CREATE INDEX D_DYNRPT_SALES_FACT_I02 ON D_DYNRPT_SALES_FACT
(TX_DATE, ITEM_GROUP, ITEM_TYPE) LOCAL;
CREATE INDEX D_DYNRPT_SALES_FACT_I03 ON D_DYNRPT_SALES_FACT
(TX_DATE, SHOP_NO, ITEM_NO, ITEM_GROUP) LOCAL;
CREATE INDEX D_DYNRPT_SALES_FACT_I04 ON D_DYNRPT_SALES_FACT
(TX_DATE, BRAND_NO, ITEM_GROUP, ITEM_TYPE) LOCAL;
CREATE INDEX D_DYNRPT_SALES_FACT_I05 ON D_DYNRPT_SALES_FACT
(TX_DATE, ITEM_NO) LOCAL;
CREATE OR REPLACE PUBLIC SYNONYM D_DYNRPT_SALES_FACT FOR
BPSADM.D_DYNRPT_SALES_FACT;
then populate data into the table as follows:
alter index D_DYNRPT_SALES_FACT_I01 unusable;
alter index D_DYNRPT_SALES_FACT_I02 unusable;
alter index D_DYNRPT_SALES_FACT_I03 unusable;
alter index D_DYNRPT_SALES_FACT_I04 unusable;
alter index D_DYNRPT_SALES_FACT_I05 unusable;
alter session set skip_unusable_indexes=true;
@pop_d_dynrpt_sales_fact.sql '2000/09/01' '2000/01/30';
.
.
.
@pop_d_dynrpt_sales_fact.sql '2003/11/01' '2003/11/30';
alter index D_DYNRPT_SALES_FACT_I01 rebuild nologging;
alter index D_DYNRPT_SALES_FACT_I02 rebuild nologging;
alter index D_DYNRPT_SALES_FACT_I03 rebuild nologging;
alter index D_DYNRPT_SALES_FACT_I04 rebuild nologging;
alter index D_DYNRPT_SALES_FACT_I05 rebuild nologging;
The data population works fine, but the alter index rebuild nologging statements fail
with the following errors:
alter index D_DYNRPT_SALES_FACT_I01 rebuild nologging
*
ERROR at line 1:
ORA-14086: a partitioned index may not be rebuilt as a whole
alter index D_DYNRPT_SALES_FACT_I02 rebuild nologging
*
ERROR at line 1:
ORA-14086: a partitioned index may not be rebuilt as a whole
alter index D_DYNRPT_SALES_FACT_I03 rebuild nologging
*
ERROR at line 1:
ORA-14086: a partitioned index may not be rebuilt as a whole
alter index D_DYNRPT_SALES_FACT_I04 rebuild nologging
*
ERROR at line 1:
ORA-14086: a partitioned index may not be rebuilt as a whole
alter index D_DYNRPT_SALES_FACT_I05 rebuild nologging
*
ERROR at line 1:
ORA-14086: a partitioned index may not be rebuilt as a whole
I checked the reference; it seems to be a syntax error in my rebuild index
statement. It should be something like alter index D_DYNRPT_SALES_FACT_I05 rebuild partition ...,
but I don't know the exact statement. Can you help me? Thanks.
Followup:
ops$tkyte@ORA920LAP> CREATE TABLE t
2 (
3 data char(255),
4 temp_date date
5 )
6 PARTITION BY RANGE (temp_date) (
7 PARTITION part1 VALUES LESS THAN (to_date('13-mar-2003','dd-mon-yyyy')) ,
8 PARTITION part2 VALUES LESS THAN (to_date('14-mar-2003','dd-mon-yyyy')) ,
9 PARTITION part3 VALUES LESS THAN (to_date('15-mar-2003','dd-mon-yyyy')) ,
10 PARTITION part4 VALUES LESS THAN (to_date('16-mar-2003','dd-mon-yyyy')) ,
11 PARTITION part5 VALUES LESS THAN (to_date('17-mar-2003','dd-mon-yyyy')) ,
12 PARTITION part6 VALUES LESS THAN (to_date('18-mar-2003','dd-mon-yyyy')) ,
13 PARTITION junk VALUES LESS THAN (MAXVALUE)
14 )
15 ;
Table created.
ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> create index t_idx1 on t(temp_date) LOCAL nologging;
Index created.
ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> alter index t_idx1 unusable;
Index altered.
ops$tkyte@ORA920LAP>
ops$tkyte@ORA920LAP> begin
2 for x in ( select 'alter index ' || index_name ||
3 ' rebuild partition ' || partition_name stmt
4 from user_ind_partitions
5 where index_name = 'T_IDX1' )
6 loop
7 dbms_output.put_line( x.stmt );
8 execute immediate x.stmt;
9 end loop;
10 end;
11 /
alter index T_IDX1 rebuild partition PART1
alter index T_IDX1 rebuild partition PART2
alter index T_IDX1 rebuild partition PART3
alter index T_IDX1 rebuild partition PART4
alter index T_IDX1 rebuild partition PART5
alter index T_IDX1 rebuild partition PART6
alter index T_IDX1 rebuild partition JUNK
PL/SQL procedure successfully completed.
December 11, 2003
Reviewer: A reader
Hi Tom,
I have been running the update statement below for the past 24 hours and it's still running.
Table S_REVN has 1.5 million records and table S_OPTY_PROD has 1.2 million. It's
doing a full table scan of both tables since I am using a function in the where clause.
We are using the rule-based optimizer and Oracle 8.1.7.4.
Here is the SQL statement:
Update siebel.S_REVN r
set x_opty_prod_city=(select X_CITY from siebel.S_OPTY_PROD p where
replace(replace(replace(p.ROW_ID,'ce','='),'-','='), '@', '$') = r.row_id),
X_opty_prod_STATE =(select X_STATE from siebel.S_OPTY_PROD p where
replace(replace(replace(p.ROW_ID,'ce','='),'-','='), '@', '$') = r.row_id),
X_PR_YEAR_QTY =(select X_PR_YEAR_QTY from siebel.S_OPTY_PROD p where
replace(replace(replace(p.ROW_ID,'ce','='),'-','='), '@', '$') = r.row_id)
0 UPDATE STATEMENT Optimizer=RULE
1 0 UPDATE OF 'S_REVN'
2 1 TABLE ACCESS (FULL) OF 'S_REVN'
3 0 TABLE ACCESS (FULL) OF 'S_OPTY_PROD'
4 0 TABLE ACCESS (FULL) OF 'S_OPTY_PROD'
5 0 TABLE ACCESS (FULL) OF 'S_OPTY_PROD'
Please advice
Thanks
Followup:
you understand that it is doing 3 full scans of S_OPTY_PROD for EACH and EVERY
row in S_REVN, yes???
how about
update t
set (a,b,c) = (select a,b,c from t2 where ... )
at the very least (reduce the full scans from 3 to 1)
at the very most, consider creating an index:
create index
temp_index_to_make_this_complete_before_my_grandchildren_have_grandchildren
on p(replace(replace(replace(p.ROW_ID,'ce','='),'-','='), '@', '$'))
and using a hint (to let sighbul use the CBO) to get the index used.
December 12, 2003
Reviewer: A reader
Hi Tom,
I am following your advice on the update; I've changed the SQL statement to:
Update /*+ INDEX(S_OPTY_PROD T_TEMP) */ siebel.S_REVN r
set (x_opty_prod_city,X_opty_prod_STATE,X_PR_YEAR_QTY) =
(select X_CITY,X_STATE,X_PR_YEAR_QTY from siebel.S_OPTY_PROD p
where replace(replace(replace(p.ROW_ID,'ce','='),'-','='), '@', '$') = r.row_id)
created function based index on S_OPTY_PROD
create index T_TEMP
on S_OPTY_PROD(replace(replace(replace(ROW_ID,'ce','='),'-','='), '@', '$'))
BUT the above SQL statement is not using the index even though I supply a hint. (Is it
because a function-based index is not recognized by the rule-based optimizer?)
Please help
Thanks
Followup:
you put the hint in the wrong place :)
and you might not have met all of the requirements for FBI's. read:
http://asktom.oracle.com/~tkyte/article1/index.html
but it is not the RBO (the well formed hint -> cbo)....
your hint is wrong (must use the correlation name of P) and in the wrong place
(should be in the subquery).
you need query_rewrite_enabled and (depending on version) might need
query_rewrite_integrity (see the above link).
but here is an example:
ops$tkyte@ORA920> create table t1( x_opty_prod_city int, x_opty_prod_state int,
x_pr_year_qty int, row_id varchar2(25) );
Table created.
ops$tkyte@ORA920> create table t2( x_city int, x_state int, x_pr_year_qty int,
row_id varchar2(25) );
Table created.
ops$tkyte@ORA920>
ops$tkyte@ORA920> create index t2_idx on t2(
replace(replace(replace(ROW_ID,'ce','='),'-','='), '@', '$') );
Index created.
ops$tkyte@ORA920>
ops$tkyte@ORA920> delete from plan_table;
4 rows deleted.
ops$tkyte@ORA920>
ops$tkyte@ORA920> ALTER SESSION SET QUERY_REWRITE_ENABLED=true;
Session altered.
ops$tkyte@ORA920>
ops$tkyte@ORA920> explain plan for
2 Update t1 r
3 set (x_opty_prod_city,X_opty_prod_STATE,X_PR_YEAR_QTY) =
4 (select /*+ INDEX( t2 t2_idx ) */ X_CITY,X_STATE,X_PR_YEAR_QTY
5 from t2 p
6 where replace(replace(replace(p.ROW_ID,'ce','='),'-','='), '@', '$')
= r.row_id)
7 /
Explained.
ops$tkyte@ORA920>
ops$tkyte@ORA920> prompt @?/rdbms/admin/utlxpls
@?/rdbms/admin/utlxpls
ops$tkyte@ORA920> set echo off
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
--------------------------------------------------
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |Cost(%CPU)|
-------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | 82 | 4346 | 3 (34)|
| 1 | UPDATE | T1 | | | |
| 2 | TABLE ACCESS FULL | T1 | 82 | 4346 | 3 (34)|
| 3 | TABLE ACCESS BY INDEX ROWID| T2 | 1 | 53 | 2 (50)|
|* 4 | INDEX RANGE SCAN | T2_IDX | 1 | | 2 (50)|
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 -
access(REPLACE(REPLACE(REPLACE("P"."ROW_ID",'ce','='),'-','='),'@','$')=:B
1)
16 rows selected.
December 12, 2003
Reviewer: A reader
Hello Tom, I load a table as follows using the parallel hint, but the cost is
different:
sql>insert /*+ append parallel(demo01,4) */ into demo01 select /*+
parallel(customer,4) */ * from customer;
the cost --42s
and the Execution Plan is:
Execution Plan
----------------------------------------------------------
ERROR:
ORA-12838: cannot read/modify an object after modifying it in parallel
SP2-0612: Error generating AUTOTRACE EXPLAIN report
Statistics
----------------------------------------------------------
1698 recursive calls
455 db block gets
8728 consistent gets
8075 physical reads
66492440 redo size
630 bytes sent via SQL*Net to client
525 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
11 sorts (memory)
0 sorts (disk)
390126 rows processed
sql>insert into demo02 select * from customer;
cost --26s
and the Execution Plan is:
Execution Plan
----------------------------------------------------------
0 INSERT STATEMENT Optimizer=CHOOSE (Cost=1226 Card=659568 Byt
es=680674176)
1 0 PARTITION RANGE (ALL)
2 1 TABLE ACCESS (FULL) OF 'CUSTOMER' (Cost=1226 Card=659568
Bytes=680674176)
Statistics
----------------------------------------------------------
1939 recursive calls
44876 db block gets
20918 consistent gets
1 physical reads
64954576 redo size
646 bytes sent via SQL*Net to client
467 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
7 sorts (memory)
0 sorts (disk)
390126 rows processed
The customer table has 0.4 million records and the machine is a 2-CPU Sun
E420 running the Oracle database.
Can you tell me why the physical reads increase so much when using the parallel hint in
the DML SQL? Thanks.
Followup:
well, for a tiny number of records like this, i would not even consider PQ at
this point.
But -- to answer your questions
the cost, well the cost is affected by thousands of things. HINTS definitely
affect the cost (That is in part how hints work -- by playing with costs).
Ignore the cost of two queries that are not the same, they are NOT comparable.
the PIOs -- well, you ran one before the other? That, and parallel query
prefers to checkpoint and do direct IO many times (you might find PQ doing a
global checkpoint before the query begins, to get current images onto disk in
order to slam through the data as fast as it can without messing with the
buffer cache).
but for so few records -- it would seem that regular sql is what you want.
December 13, 2003
Reviewer: A reader
Hi Tom,
Followed your advice... WORKS LIKE A CHARM... updated 1.5 million records in 10 minutes.
EXCELLENT!!!
Thanks a lot.
Update siebel.S_REVN r
set (x_opty_prod_city,X_opty_prod_STATE,X_PR_YEAR_QTY) =
(select /*+ INDEX(S_OPTY_PROD T_TEMP) */ X_CITY,X_STATE,X_PR_YEAR_QTY from
siebel.S_OPTY_PROD p
where replace(replace(replace(p.ROW_ID,'ce','='),'-','='), '@', '$') = r.row_id)
Online updation of very big table March 06, 2004
Reviewer: Praveen from Delhi
Hi Tom
I want to update a table in one go on an online system. The table has 20 lakh (2
million) records and 110 columns. When I issue the update command it takes about
an hour, and I don't know why it takes so much time even though an index is
created on that particular field.
select count(*), app_flg from test group by app_flg;
app_flg has only two values, APPROVED or UNAPPROVED. The default is UNAPPROVED.
approved = 170000
unapproved = 30000
update test set app_flg='APPROVED' where app_flg='UNAPPROVED'
It took 1 hour to update the records, other online users' processing slowed down,
and locks started to occur on the table.
Please suggest the best solution.
pkr
Followup:
It would be the INDEX that slowed it down.
You want to FULL SCAN this table.
You want there to be NO index on the column you are updating.
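For illustration only -- a minimal sketch of that approach (the index name is made up; look
up the real one in user_indexes, and remember the index is unavailable to other queries
while it is gone):

drop index test_app_flg_idx;

update test set app_flg = 'APPROVED' where app_flg = 'UNAPPROVED';

commit;

create index test_app_flg_idx on test(app_flg);   -- only if the index is truly needed afterwards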
it gives error March 07, 2004
Reviewer: praveen from india
On dropping the index it gives error ORA-00054 (resource busy).
thanks
Followup:
that'll happen whilst there are outstanding transactions, yes. You'll want to
keep trying until you get a chance to drop it.
thanks March 07, 2004
Reviewer: jasdeep,praveen from patiala india
I have solved that problem: a user had locked rows on the table but was no longer
logged on. I killed that session and the index was dropped immediately,
and the updates were as fast as you can imagine.
thanks
Update taking hrs of time March 09, 2004
Reviewer: Sachin from India
Hi Tom,
I have two tables: table1 (with around 10,000 records max) and table2 (actually the
GL code combination master table, with around 600,000 records). I need to update
three fields in table1 (which is a temp processing table) with a unique value
from table2. T1 has ccid fields which need to be updated, and s1-s4 fields
corresponding to the segment1-segment4 fields of table2.
The query is like the following:
UPDATE table1 t1
SET t1.ccid1 = (SELECT t2.code_combination_id
FROM gl_code_combinations t2,
gl_sets_of_books gb
WHERE t2.segment5 = 'yyyyy'
and t2.segment4 = t1.s4
and t2.segment3 = 'XXXXX'
and t2.segment2 = t1.s2
and t2.segment1 = t1.s1
and t2.coa_id = gb.chart_of_accounts_id
and gb.set_of_books_id = 1),
t1.ccid2 = (SELECT t2.code_combination_id
FROM gl_code_combinations t2,
gl_sets_of_books gb
WHERE t2.segment5 = 'yyyyy'
and t2.segment4 = t1.s4
and t2.segment3 = 'zzzzzz'
and t2.segment2 = t1.s2
and t2.segment1 = t1.s1
and t2.coa_id = gb.coa_id
and gb.set_of_books_id = 1),
t1.ccid3 = (SELECT t2.ccid
FROM gl_code_combinations t2,
gl_sets_of_books gb
WHERE t2.segment5 = 'yyyyy'
and t2.segment4 = t1.s4
and t2.segment3 = 'wwwww'
and t2.segment2 = t1.s2
and t2.segment1 = t1.s1
and t2.coa_id = gb.coa_id
and gb.set_of_books_id = 1)
But this is taking more than 6 hours to complete. When I check the table locks, the
table remains locked row exclusively. I am committing immediately after the
update statement in the procedure. Could you please tell me why this is happening?
Followup:
not that i don't believe you but -- about the locking issue -- I'm not sure i
believe you.
when you commit -- locks are released. You'd have to show me the info from
v$lock showing that t1 is still "locked" and locked by your session.
you could
set (c1,c2,c3) = ( select max(decode(segment3,'xxxxx',ccid)),
max(decode(segment3,'zzzzzz',ccid)),
max(decode(segment3,'wwwww',ccid))
FROM gl_code_combinations t2,
gl_sets_of_books gb
WHERE t2.segment5 = 'yyyyy'
and t2.segment4 = t1.s4
and t2.segment3 in ('xxxxx','zzzzzz', 'wwwww')
and t2.segment2 = t1.s2
and t2.segment1 = t1.s1
and t2.coa_id = gb.coa_id
and gb.set_of_books_id = 1 )
instead of running a query per column.
if you run the query
select max(decode(segment3,'xxxxx',ccid)),
max(decode(segment3,'zzzzzz',ccid)),
max(decode(segment3,'wwwww',ccid))
FROM gl_code_combinations t2,
gl_sets_of_books gb
WHERE t2.segment5 = 'yyyyy'
and t2.segment4 = some_value_value
and t2.segment3 in ('xxxxx','zzzzzz', 'wwwww')
and t2.segment2 = some_value_value
and t2.segment1 = some_value_value
and t2.coa_id = gb.coa_id
and gb.set_of_books_id = 1
how long does that take?
Updating millions of rows March 15, 2004
Reviewer: A reader from NJ, USA
Tom,
In this discussion you mention that you would not do an UPDATE/DELETE of
millions of rows, but rather would create a new table (in parallel/nologging)
with the rows that needs to be retained and then create indexes in parallel,
drop the old table and rename the new table to old table. If so, my
understanding is :
1. Delete/UPDATE of tables with more than 1 million row would also have to
maintain the index structure and also will generate considerable rollback and
undo.
2. Another consideration with respect to update/delete is that the blocks will be
moved in and out of the freelists. Please confirm whether this is true and, if so,
please explain what happens during an update/delete of a large number of rows with
regard to freelists.
3. Another consideration is that it will cause a lot of table/segment fragmentation
and a lot of empty blocks below the HWM, and all full table scans on the table will
read a lot of empty blocks.
Is my understanding correct? If I have missed any other reasons why
DELETE/UPDATE of large tables is not recommended, please let me know.
4. If my table that needs to be updated/deleted is the primary/parent table, with
dependent child tables/constraints and triggers, is this approach correct?
Disable all the triggers/constraints
Drop the table
Rename the new table to the old table
Re-enable all constraints and triggers
What to do in such cases?
Followup:
I "might", "probably" if it was most of the records...
1) correct
2) correct, deletes would put them on, updates might put them on, might take
them off.
3) it could, yes.
the delete/update can take lots longer than CTAS + reindex if you are doing MOST
all of the records in a LARGE table with lots of indexes.
4) triggers are not relevant here....
but you would have to drop the fkeys
do the CTAS+drop+rename
add the constraints back with enable/rely (don't need validation)
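For example, a rough sketch of that sequence -- the parent table T, child table C and the
constraint/column names here are all made up:

alter table c drop constraint c_fk;

create table t_new nologging parallel 4
as
select /* ...with the modified column values... */ * from t;

drop table t;
rename t_new to t;
-- recreate indexes and the primary key on T here, then:

alter table c add constraint c_fk
  foreign key (t_id) references t
  rely enable novalidate;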
Deletes/Updates March 15, 2004
Reviewer: A reader from NJ, USA
Tom,
In the above discussion you mention :
1. The deletes will put them on because the used space in the block will drop
below PCTUSED. Is that correct?
2. The update will put them on if the updated column makes the used space in
the block fall below PCTUSED, or makes the free space
available in the block less than PCTFREE. Is that correct?
If either of the above is not correct, please explain. Also, please
let me know how we could track the movement of blocks on and off the
freelist. Are any SQL scripts/tools available that I could use?
Thanks
Followup:
1) the deletes could, yes.
2) yes.
to track individual blocks -- not without dumping blocks (and I don't go there).
dbms_space and other tools (search this site for show_space for the script I
use) can show you how MANY things are on the freelist.
A small correction March 15, 2004
Reviewer: A reader from NJ, USA
Tom,
In the above scenario for question 2, there is a small correction marked in
CAPITAL letters :
"The update will put them on if the updated column makes the used space in the
block fall below PCTUSED AND WILL TAKE OFF of the freelist if the updated column
makes the free space available in the block less than PCTFREE. "
Is it correct?
Thanks
Followup:
I'll state it simply, forget pctfree/pctused/etc -- in english:
if the update reduces the row size, it can put it back onto the freelist.
if the update increases the row size, it can take it off the freelist.
Parallel DML March 16, 2004
Reviewer: A reader
1)On page 630 of your Oracle Expert One on One, you mention that parallel DML
can be used only upon partitions -- in 8i. I'm using 9.2. Has this behavior
changed in the later versions ?
2) I need to do a mass delete on a 12 GB table and it's taking too much time (far
too much, really). How can I speed it up?
Thanks
Followup:
1) yes, that is a 9ir2 new feature.
http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96531/ch2_9ir2.htm#74432
2) did you read the original answer above?
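If you do decide to try parallel DML for that delete (rather than the CTAS approach
described at the top of this page), remember it must be enabled per session -- a sketch
only, with a made-up table name and degree:

alter session enable parallel dml;

delete /*+ parallel(big_t, 8) */ from big_t where <your condition>;

commit;   -- parallel DML must be committed before the session can query big_t again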
enable PK with parallel clause March 20, 2004
Reviewer: A reader
Hi
I am working in 8.1.7.4.
I want to delete 30 million rows of 60 million so I am doing this
1. create a copy of original table with good data
2. disable the original table's constraints, primary keys (and child FKs) and
foreign keys
3. truncate the original table
4. make all original tables indexes unusable
5. insert append back from copy to original, I have to do this way because I am
not allowed to modify constraint names
6. drop the copy
7. enable constraints. Here, when I enable the PK I cannot provide a parallel clause,
right? I searched the doc but it seems that I can only specify a using index
tablespace clause -- or am I missing something?
8. rebuild all unusable indexes
Followup:
you can create an index in parallel (unique) and then enable/rely/novalidate the
pk.
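For example -- a sketch only, with made-up table/column/constraint names (this would be
steps 7 and 8 of your list):

create unique index t_pk_idx on t(id) parallel 8 nologging;

alter table t enable novalidate constraint t_pk;   -- picks up the existing unique index
alter table t modify constraint t_pk rely;         -- optional: let the optimizer trust it without validating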
How to Update millions or records in a table March 25, 2004
Reviewer: A reader
Hi Tom,
I read your response to Murali's question above and believe there will be a
downtime window for the application. This is because, if I want to keep the
index names the same as before, I will have to create the new table, drop the old
table, rename the new table to the old table name and then create the required
indexes on it.
I am wondering whether we can create the indexes under some other name on the new
table and rename the indexes after dropping the old table.
As always your valuable advice helps a lot!
Followup:
you can rename the indexes.
alter index supports this.
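For example (the index names here are just placeholders):

alter index new_emp_idx rename to emp_idx;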
how to make this restartble? March 25, 2004
Reviewer: A reader
Hi
I want to delete 80% of rows from a couple of 50 million rows table. The process
is divided in 10 steps
1. Create TEMP table, CTAS from original table
2. Disable constraints
3. Truncate original table
4. Set indexes to UNUSABLE
5. Insert into original table from TEMP table
6. Drop TEMP table
7. Create PK and UK
8. Enable PK and UK
9. Enable FKs
10. Rebuild indexes
I want to make this process restartable, i.e. if it fails at step 3, when I later
rerun the procedure it will start from step 3 again. How can we achieve this? I
have been thinking of using a table which stores the table name and process
status, something like the following, and updating the status as the process
goes along:
TABLE: PROCESS_STATUS
TABLE_NAME STATUS
----------- ---------------
EMP 2
DEPT 0
The status is the last completed step; 0 means that I need to start from the
beginning. In each process I would query this table and see which step should be
run. For example, if the process failed at step 3 then STATUS would be 2.
select status into l_proc_status from PROCESS_STATUS
where table_name = 'EMP'
if l_proc_status = 0
then
run process 1
end if;
select status into l_proc_status from PROCESS_STATUS
where table_name = 'EMP'
if l_proc_status = 1
then
run process 2
end if;
select status into l_proc_status from PROCESS_STATUS
where table_name = 'EMP'
if l_proc_status = 2
then
run process 3
end if;
and so on.
Any suggestions? What I see is that I will have quite a lot of redundant code...
Followup:
you'd have to keep a state table and have your code query it up, much like you
are suggesting. That would do it, yes.
another "inserting" idea might be to
a) insert the steps to process
b) delete them as you complete them (and commit)
c) to restart, just pick up at the step you wanted
you could insert the procedures to be called and just:
for x in ( select name from procedures order by seq )
loop
execute immediate 'begin ' || x.name || '; end;';
delete from procedures....
commit;
end loop
(just a thought, not fully baked)
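To flesh that out a little -- a sketch only, assuming each step is implemented as a stored
procedure and using a made-up driver table:

create table purge_steps ( seq number, name varchar2(30) );

insert into purge_steps values ( 1, 'copy_rows_to_keep' );
insert into purge_steps values ( 2, 'disable_constraints' );
-- ... one row per step ...
commit;

begin
    for x in ( select seq, name from purge_steps order by seq )
    loop
        execute immediate 'begin ' || x.name || '; end;';
        delete from purge_steps where seq = x.seq;
        commit;
    end loop;
end;
/

If the run dies, the rows left in purge_steps are exactly the steps still to do, so
rerunning the block restarts where it left off.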
provide more information about that procedure drop indexes and rebuild with nologging April 07, 2004
Reviewer: mohan from india
Hi Tom
Could you provide more information about that procedure and
how to drop indexes and rebuild them with nologging? We are using the Informatica
ETL tool: before loading bulk data into the target we drop the indexes (pre-session),
and after loading the data we rebuild the indexes with nologging (post-session).
It takes less time because it generates less undo.
Regards
Mohan
Followup:
it is just a drop and create? not sure what you are looking for -- if it is
"syntax", we document that.
Problem with Update April 08, 2004
Reviewer: Ram from Bangalore,India
Hi Tom,
This update is not working properly. How do I correct it?
SQL> declare
2 x number;
3 y number;
4 begin
5 execute immediate 'update e set sal = sal*1.1 where ename in
('SMITH','KING') returning sal into :x,:y' using x,y;
6 dbms_output.put_line(x || chr(9)||y);
7* end;
*
ERROR at line 5:
ORA-06550: line 5, column 65:
PLS-00103: Encountered the symbol "SMITH" when expecting one of the following:
. ( * @ % & = - + ; < / > at in is mod not rem return
returning
like between into using || bulk
Please do reply.
Bye!
Followup:
..... ( ''SMITH'', ''KING'' ) .....
you need two quotes for a quote in a string (until 10g when there is another
way)
Updating million records April 08, 2004
Reviewer: Himnish Narang from Chandigarh, India
Hi Tom,
I was just going through the discussion on this subject. In it you describe
creating a new table instead of going for updates of millions of records,
with the update performed as part of the creation of the new table.
Don't you think that the new table will also occupy the same amount of space, and
that after dropping the old table we will have to do a reorg of that tablespace?
Please comment.
Thanx.
Followup:
reorgs are so overrated.
no, i would not reorg a tablespace or anything simply because I copied some data
and dropped some old stuff. not a chance.
How to Update millions or records in a table April 08, 2004
Reviewer: Ganesh from Visakhapatnam
Hi Tom,
This is very useful, and I used it to save lots of time by creating a new table.
I have another issue similar to this. We are using 9i. There is a requirement to
modify a primary key's datatype from NUMBER to VARCHAR2, and it has a lot of
dependents. Is there any option without rebuilding the table, as data already exists?
Can you please advise on this?
Thanks in Advance.
Ganesh
Followup:
that'll be a rebuild pretty much.
Please see this April 08, 2004
Reviewer: Ram from Bangalore,India
Hi Tom,
Thanks for your reply, but it's still not working properly.
SQL> declare
2 x number;
3 y number;
4 begin
5 execute immediate 'update e set sal = sal*1.1 where ename in
(''SMITH'',''ALLEN'')
6 RETURNING sal into :x,:y' using x,y;
7 dbms_output.put_line(x ||chr(9) ||y);
8 end;
9 /
declare
*
ERROR at line 1:
ORA-00913: too many values
ORA-06512: at line 5
How do I correct this? You mention another way in Oracle 10g. How does that
work? Could you please provide an example?
Please do reply.
Bye!
Followup:
you are returning SAL (one thing) into two things (:x,:y)
different problem altogether.
in 10g, the strings could be
q'|how's this for quoting|'
instead of
'how''s this for quoting'
a character string literal that starts with q for quote will use the next two
characters as the start quote and those two characters in reverse for the end
quote.
Thanks April 09, 2004
Reviewer: Ram from Bangalore,India
Hi Tom,
Thanks for your reply. I found a way of doing it as follows. Do you
have any other option to do it in a better way?
SQL> create or replace type sal_t as table of number;
2 /
Type created.
SQL> declare
2 sals sal_t := sal_t();
3 begin
4 execute immediate 'update emp set sal = sal*1.1 where ename in
(''SMITH'',''KING'')
5 returning sal into :sals' returning bulk collect into
sals;
6 for i in 1..sals.count loop
7 dbms_output.put_line('Updated salary:'||sals(i));
8 end loop;
9 end;
10 /
Updated salary:968
Updated salary:5500
PL/SQL procedure successfully completed.
Please do reply.
Bye!
Followup:
get rid of the dynamic sql, it isn't necessary (fixes your original issue with
the quotes as well). also, would seem to make sense to pull back other columns
so you would be able to see what rows were actually updated:
ops$tkyte@ORA9IR2> declare
2 type numArray is table of number;
3 l_sals numArray;
4 l_empno numArray;
5 begin
6 update emp
7 set sal = sal*1.1
8 where ename in ('SMITH','KING')
9 returning empno, sal bulk collect into l_empno, l_sals;
10
11 for i in 1..l_sals.count loop
12 dbms_output.put_line
13 ('Updated salary for empno ' || l_empno(i) ||
14 ' to: '||l_sals(i));
15 end loop;
16 end;
17 /
Updated salary for empno 7369 to: 880
Updated salary for empno 7839 to: 5500
PL/SQL procedure successfully completed.
To Mr. Ram April 09, 2004
Reviewer: A reader
Hi Ram and All,
May I request that we all not pester Tom with inane posts asking him to debug
and program on our behalf.
This forum is not for learning to code.
update based on rowid April 16, 2004
Reviewer: john
Tom,
We use a non-intelligent (surrogate) primary key to update the table through a stored
procedure, i.e. when the table is queried, our select procedure returns the
non-intelligent key along with the details, like below:
select emp_key, other details from emp;
When updating the data, emp_key is passed from the front end to the stored
procedure and the table is updated based on emp_key.
Is it a bad idea to use rowid instead of the non-intelligent key (emp_key)?
Followup:
non-intelligent? interesting choice of terminology.
But in any case -- once upon a time ago, rowids were IMMUTABLE. Once assigned a
row would have a rowid assigned to it and that rowid would live with that row
until you deleted it.
Starting in 8i with support for updates to partition keys that would cause a row
to move from partition A to partition B -- that is no longer true (and then
there are IOT's...).
In 10g, there are even more options for 'row movement' -- an online segment
shrink for example.
So, rowids can change and are changing in more circumstances as time goes on.
Sooooo, if you have lost update detection in place using 'FOR UPDATE' -- rowids
are very safe (forms uses them). What I mean is -- you
a) select a.*, rowid from T a where .... 'normally'
b) you decide to update a row, so you
select * from t where rowid = :that_rowid AND
((c1 = :that_c1 or (c1 is null and :that_c1 is null)) AND
((c2 = :that_c2 or (c2 is null and :that_c2 is null)) AND ...
((cN = :that_cN or (cN is null and :that_cN is null))
for update NOWAIT;
you lock it -- this prevents anything that could change a rowid from doing so
and verifies the data hasn't been changed. If that returns 0 rows, someone
changed the data (or reorg'ed it and moved it). You need to requery to get the
current values before you let the user even think about modifying it. If that
returns a row -- you got it. If that returns ora-54, something has it locked,
you have to decide what you want to do about that.
c) you can then safely update that row by rowid
If you do not have lost update detection in place using FOR UPDATE locks -- then
you should stick with a primary key probably, just to protect yourself in the
future.
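To make steps (b) and (c) concrete, here is a bare-bones sketch for a two-column table
(the table, columns and bind names are all made up, and real code would list every column
and handle the errors properly):

declare
    resource_busy exception;
    pragma exception_init( resource_busy, -54 );
    l_row t%rowtype;
begin
    select *
      into l_row
      from t
     where rowid = :that_rowid
       and ( c1 = :that_c1 or ( c1 is null and :that_c1 is null ) )
       and ( c2 = :that_c2 or ( c2 is null and :that_c2 is null ) )
       for update nowait;        -- verify the row is unchanged and lock it

    update t
       set c2 = :new_c2
     where rowid = :that_rowid;  -- safe now: the row is locked and verified
exception
    when no_data_found then
        raise_application_error( -20001, 'row was changed or moved -- requery it' );
    when resource_busy then
        raise_application_error( -20002, 'row is locked by another session' );
end;
/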
update based on rowid April 19, 2004
Reviewer: john
thanks a lot Tom.
To understand fully what you said, can you please tell me:
1. Why do we need the where condition?
((c1 = :that_c1 or (c1 is null and :that_c1 is null)) AND
((c2 = :that_c2 or (c2 is null and :that_c2 is null)) AND ...
((cN = :that_cN or (cN is null and :that_cN is null))
2. Another question: is there a situation where the rowid of a row could be
assigned to another row of the same table? (This may be more dangerous, as we could
end up updating another row.)
thanks again.
Followup:
1) you want to make sure the row hasn't changed.
c1 = :that_c1 will only with if both the column and the bind are not null,
hence the rest is needed to make it so null's compare "the same"
2) this is tied into 1 again. sure, we could delete from emp where empno =
7788; commit; and then re-insert that row and get the same rowid (rare, but
could happen).
But - if we inserted the same exact values and they got the same exact rowid --
then #1 would make this "safe". the values of the row are the same, for all
intents and purposes it IS the same row.
If the newly inserted row doesn't match column by column the values we expect,
well, then we'll reject it (won't lock it) and all will be well.
update large table April 22, 2004
Reviewer: Prasad Chittori from US
I have a very large partitioned table with a DATE column; I would like to take out
the time portion from the date column.
I did the following and it is taking a lot of time and failing with 'unable to
extend rollback segment'.
Any better ideas without increasing the rollback segments or changing the query?
update t1
set dt = trunc(dt)
Followup:
parallel dml -- each pq slave can get its own RBS, letting you use them all at
the same time, not just one of them.
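In this case that would look something like the following sketch (the degree is made up;
since t1 is partitioned this applies even before 9iR2):

alter session enable parallel dml;

update /*+ parallel(t1, 8) */ t1 set dt = trunc(dt);

commit;   -- required before the session can read t1 again after parallel DML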
convert delete to insert April 26, 2004
Reviewer: marvin
Hi
We want to delete from several tables of several million rows each. The
delete statements are quite simple, but to make this faster we would insert
the good data into a temp table, truncate the original and insert it back. We want to
convert the deletes into insert statements, but I think it's very hard.
For example
We have these DELETE statements
DELETE
FROM OM_ANNEX t
WHERE EXISTS
(SELECT 1
FROM LISTA_ORDENES l,OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id=l.order_id)
/
COMMIT
/
DELETE
FROM OM_ANNEX t
WHERE EXISTS(SELECT 1 FROM OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id IS NULL)
/
COMMIT
to convert to that we did
INSERT /*+ APPEND */ INTO OM_ANNEX$TMP
SELECT t.*
FROM OM_ANNEX t
WHERE NOT EXISTS (SELECT 1
FROM LISTA_ORDENES l,
OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id=l.order_id)
UNION
SELECT t.*
FROM OM_ANNEX t
WHERE NOT EXISTS (SELECT 1
FROM OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id IS NULL);
This seems incorrect: if I run the queries inside the insert separately I get correct
results, but if we use UNION we get data which we don't want, and if we do it in two
inserts we get duplicate rows.
One approach is this
create a temp table using the delete statements but converted to querys for
example instead of
DELETE
FROM OM_ANNEX t
WHERE EXISTS
(SELECT 1
FROM LISTA_ORDENES l,OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id=l.order_id)
we convert to
select t.*
FROM OM_ANNEX t
WHERE EXISTS (SELECT 1
FROM LISTA_ORDENES l,OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id=l.order_id)
then simply say
select * from om_annex a
where not exists (select null from temp_table b
where a.annexid = b.annexid)
annexid is PK
the other approach is this
select * from om_annex a
where annexid not in (select annexid
FROM OM_ANNEX t
WHERE EXISTS (SELECT 1
FROM LISTA_ORDENES l,OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id=l.order_id)
union
select annexid
FROM OM_ANNEX t
WHERE EXISTS(SELECT 1
FROM OM_CONTRACT_INST c
WHERE t.contract_id=c.contract_id
AND c.order_id IS NULL))
They are all very slow because the tables involved have several million rows!
Is there a better approach?
Followup:
why not:
create table new as select rows to keep;
drop table old;
rename new to old;
do the first in parallel, with nologging.
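Spelled out a bit more -- a sketch only (the new table name is made up, the predicate is
left for you to fill in, and grants/indexes/constraints from the old table must be
recreated on the new one):

create table om_annex_keep nologging parallel 4
as
select /*+ parallel(t,4) */ t.*
  from om_annex t
 where ( ...predicate selecting only the rows to KEEP... );

drop table om_annex;
rename om_annex_keep to om_annex;
-- then recreate indexes (parallel, nologging), grants and constraints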
err the problem is convert the DELETE to CTAS April 26, 2004
Reviewer: A reader
Hi
create table new as select rows to keep;
drop table old;
rename new to old;
do the first in parallel, with nologging.
That is exactly what I want to do. The problem is that until now we have always done
it the other way round, using plain DELETE (and it takes a week to delete everything!),
and I am not sure how to convert the DELETEs to CTAS. If I want to do the
reverse of the DELETE statements (some tables have 5 DELETE statements!), is it as
simple as writing the DELETE the other way round? For example, how would you change
delete tab1
where exists (select null
from tab2, tab3
where tab2.id = tab3.id
and tab1.id = tab2.fid)
delete tab1
where exists (select null
from tab2
where tab2.fid = tab1.id
and tab2.id is null)
Would you change it to
insert into tmp_x
select *
from tab1
where not exists (select null
from tab2, tab3
where tab2.id = tab3.id
and tab1.id = tab2.fid)
insert into tmp_x
select *
from tab1
where not exists (select null
from tab2
where tab2.fid = tab1.id
and tab2.id is null)
Is it as simple as this?
Followup:
if i had this:
delete tab1
where exists (select null
from tab2, tab3
where tab2.id = tab3.id
and tab1.id = tab2.fid)
delete tab1
where exists (select null
from tab2
where tab2.fid = tab1.id
and tab2.id is null)
I would probably have this:
create table tab1_new
as
select tab1.*
from tab1, tab2, tab3
where tab1.id = tab2.fid(+)
and tab2.id = tab3.id(+)
and NOT ( tab2.fid is not null and tab3.id is not null )
and NOT ( tab2.fid is not null and tab2.id is null )
/
outer join the three tables. Negate the conditions for the where exists.
that is, after outer joining tab1 to tab2, tab3 -- remove the rows
where tab2.fid is not null and tab3.id is not null -- that is subquery one in
your deletes above.
where tab2.fid is not null and tab2.id is null -- that is subquery two in your
deletes above.
thank you very much for the outer join tip April 27, 2004
Reviewer: marvin
Hi
I am going to have a look at how to apply the outer join in order to convert DELETE
to CTAS.
I have a further question: I have another table which undergoes 4 DELETEs.
DELETE
FROM SW_PERSON t
WHERE EXISTS
(SELECT 1
FROM LISTA_ORDENES o
WHERE o.order_id=t.swobjectid AND t.swtype='ORDER')
/
COMMIT
/
DELETE
FROM SW_PERSON t
WHERE t.swtype='ORDER' AND t.swobjectid IS NULL AND COMP_INST_ID IS NULL
/
COMMIT
/
DELETE
FROM SW_PERSON t
WHERE t.swtype IS NULL
/
COMMIT
/
DELETE
FROM SW_PERSON t
WHERE t.swtype='ORDER'
AND t.swobjectid IS NULL
AND EXISTS
(SELECT 1
FROM OM_COMPANY_INST c, LISTA_ORDENES l
WHERE c.COMP_INST_ID=t.COMP_INST_ID
AND l.order_id=c.order_id)
/
COMMIT
/
I need to convert this to CTAS as well; however, I am not sure if this can be
done in a single statement. These DELETEs, for example, can't be converted into one
as follows, right? (Because of the commits between them.)
DELETE
FROM SW_PERSON t
WHERE EXISTS (SELECT 1
FROM LISTA_ORDENES o
WHERE o.order_id=t.swobjectid
AND t.swtype='ORDER')
OR (t.swtype='ORDER'
AND t.swobjectid IS NULL
AND COMP_INST_ID IS NULL)
OR t.swtype IS NULL
OR (t.swtype='ORDER'
AND t.swobjectid IS NULL
AND EXISTS (SELECT 1
FROM OM_COMPANY_INST c,
LISTA_ORDENES l
WHERE c.COMP_INST_ID=t.COMP_INST_ID
AND l.order_id=c.order_id));
Can this use the outer join tip as well?
TIA
Followup:
why have any commits in between?
but of course -- any four deletes against a single table can (and if you ask me,
should) be done as a single delete.
the outer join was used in the CTAS, not in a delete.
why do you use outer join April 27, 2004
Reviewer: A reader
hi
Why is the outer join needed for tab1, tab2 and tab3 :-?
create table tab1_new
as
select tab1.*
from tab1, tab2, tab3
where tab1.id = tab2.fid(+)
and tab2.id = tab3.id(+)
and NOT ( tab2.fid is not null and tab3.id is not null )
and NOT ( tab2.fid is not null and tab2.id is null )
/
Followup:
because we wanted to keep all rows in tab1 that have NO mate in tab2/tab3,
since the deletes were "where exists in tab2/tab3":
delete tab1
where exists (select null
from tab2, tab3
where tab2.id = tab3.id
and tab1.id = tab2.fid)
delete tab1
where exists (select null
from tab2
where tab2.fid = tab1.id
and tab2.id is null)
Since the deletes removed the rows that DO have mates in tab2/tab3, the CTAS needs to
ALWAYS keep the rows that are NOT in tab2/tab3. The outer join is mandatory for that
in this example.
to marvin April 27, 2004
Reviewer: A reader
hi marvin, you can try this
select * from tab1
where PK not in (select PK
                 from tab1
                 where exists (select null
                               from tab2, tab3
                               where tab2.id = tab3.id
                               and tab1.id = tab2.fid)
                 union
                 select PK
                 from tab1
                 where exists (select null
                               from tab2
                               where tab2.fid = tab1.id
                               and tab2.id is null))
regarding the conditions April 27, 2004
Reviewer: A reader
Hi Tom
Could you shed some light on why
NOT ( tab2.fid is not null and tab3.id is not null )
is the same as
exists (select null
from tab2, tab3
where tab2.id = tab3.id
and tab1.id = tab2.fid)
and
NOT ( tab2.fid is not null and tab2.id is null )
is same as
exists (select null
from tab2
where tab2.fid = tab1.id
and tab2.id is null)
I can't see why. Thank you.
Followup:
it isn't the same.
it is in fact the opposite.
if you outer join T1 to T2 to T3
and before you were looking for (where exists)
a) a match in T2 (tab1.id = tab2.fid)
b) a match in T3 for that T2 (tabe2.id = tab3.id)
then you are saying "if I outer join T1 to T2 to T3, that row would be such
that:
a) tab2.fid is NOT NULL (we found a mate)
b) tab3.id is NOT NULL (we found a mate in t3 for t2)
with the where exists -- we would have deleted that row, hence with the CTAS
(which is finding rows to keep) we simply NEGATE that (with NOT).
Therefore we would keep that row, IF that was "not" satisfied.
Same logic for the second part.
the second where exists says
delete the row if
a) there is a match in T2 (where tab2.fid = tab1.id)
b) the id column in t2 for that match is NULL
that would be (in an outer join)
(tab2.fid is not null) -- we joined to a row
(tab2.id is null) -- and that row is having id is null
negate it and keep it.
Updating Table having millions of records taking lot of Time.. April 28, 2004
Reviewer: Anand Pandey from INDIA
Hi Tom,
I have a table having millions of records in which two of its columns are null.
I just tried to update the null columns with the data from another table, which is
taking around 10-12 hours for a single day's records, and I have to update it
for 31 days.
Please help me get good performance on this update.
The query used is:
UPDATE /*+nologging parallel(4) */ MASTER_tab A SET
(null_col1,null_col2)=(SELECT /*+PARALLEL(4) */ MIN(C1),MIN(C2)
FROM Table2 B WHERE SUBSTR (A.col1_T1, 1, LENGTH (B.C1)) = B.C1
AND SUBSTR(A.col2_T1,1,3)=B.C2)
WHERE c_date='01-Jan-04'
Thanks and Regards,
Anand Pandey
Nologging - how does it impact recovery? April 30, 2004
Reviewer: Naresh from Cyprus
Hi Tom,
This is a great chain of discussion. I especially liked the "outer join to
replace the not exists". I am really looking forward to my copy of your first
book that I oredered recently (on it's way from amazon).
One question regarding making the table nologging: Does it not have implications
for recovery? What am I missing?
Followup:
you need to schedule a hot backup if you use non-logged operations, yes.
db sequential waits on UPDATE May 14, 2004
Reviewer: A reader
Dear Tom,
Due to some processing complexities I am unable to use a single SQL for
Update/Insert.
I use:
LOOP
1. Bulk select (500 rows at a time) from Table A, C with rowids from C
2. Bulk insert
3. Bulk update table C
END LOOP
I am getting a very high number of "db file sequential read" waits on the update
part. Please see relevant portions of Trace File:
********************************************************************************
SELECT /*+ FULL(v) FULL(c) PARALLEL(v,4) PARALLEL(c,4) HASH(v,c) */
c.rowid,
...
FROM vp v,citi c
WHERE v.idno= c.idno
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.03 0.07 10 234 0 0
Execute 1 0.32 3.38 3023 156 3 0
Fetch 155 1.96 82.82 0 0 0 77500
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 157 2.32 86.28 3033 390 3 77500
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 95 (recursive depth: 1)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 4 0.01 0.03
rdbms ipc reply 26 0.00 0.00
enqueue 2 0.02 0.02
PX Deq: Join ACK 7 0.00 0.00
PX Deq: Parse Reply 4 0.23 0.35
PX Deq: Execute Reply 717 2.06 735.45
PX qref latch 4 1.02 2.75
PX Deq: Table Q Normal 5218 0.35 23.95
********************************************************************************
INSERT /*+ APPEND PARALLEL(A,4) */ INTO t_v A
VALUES(:b1,
:b2)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.01 2 11 0 0
Execute 155 0.32 8.00 552 1116 5567 77500
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 156 0.32 8.02 554 1127 5567 77500
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 95 (recursive depth: 1)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 552 0.36 6.43
free buffer waits 6 1.02 1.23
********************************************************************************
UPDATE /* PARALLEL(c,4) */ citi c
SET ...
WHERE c.rowid = :b6
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 1 0 0
Execute 154 35.26 586.87 75251 77036 79595 77000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 155 35.26 586.87 75251 77037 79595 77000
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 95 (recursive depth: 1)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
db file sequential read 75578 1.04 461.09
free buffer waits 242 1.03 96.11
********************************************************************************
Is it not true that if the row has already been selected (and is there in the
buffer cache) no disk IO should happen to update this row?
Please tell me a way to make this faster.
Regards
Followup:
insert /*+ append parallel... */ VALUES hmmmmmm
append = no go with values, only works with "as select"
parallel -- parallel on what exactly?
update /* parallel(c,4) */ hmmmmmm
in search of the fast = true switch. thankfully that hint is malformed and hence
ignored as well -- you are updating by rowid.
parallel is *not* fast=true.
append is not either really, especially with the values clause.
you can review the trace file itself (p1,p2,p3 will tell you file and block
info, along with blocks read). you can use that to verify that it is the very
act of reading the indexes that need to be updated and maintained that is
causing this. If they are not in the cache, well, we'll need to read them into
there.
More Info May 16, 2004
Reviewer: A reader
Thanks Tom,
The hint in the update was there by a Developer, it has been rightly disabled.
The insert is not a bottleneck so didn't look into it.
This is a development machine, there are no other jobs running, asynchronous IO
is enabled, the machine is on RAID 0 (no fault tolerance - being a development
one). There are NO INDEXES on Tables being inserted and updated.
The segment on which the WAITS (db file sequential read) are happening is that of
the TABLE that is being UPDATED.
Please guide me next.
Followup:
then you are seeing physical IO performed to read into the buffer cache the data
needed to be updated. If it is not cached, we'll have to read it.
PARALLEL DML May 28, 2004
Reviewer: Sar from NJ, USA
Tom,
I need to update a table that has 50 million rows, but
the number of rows affected is only 1 million.
I have a single update statement to do this.
UPDATE acc_proc
SET acc_seq_num = 0
WHERE acc_seq_num > 0
/
The column acc_seq_num is not part of an index.
Can you please suggest whether there is anything better
that I can do to this SQL to make it run faster.
Thanks
Sar
Followup:
nope, that is perfect.
shouldn't take very long at all. unless you are getting blocked constantly by
other sessions.
do i need to commit after execute immediate ? for dml or ddl ? June 02, 2004
Reviewer: A reader
http://download-east.oracle.com/docs/cd/A87861_01/NT817EE/index.htm
Do I need to commit after execute immediate, for DML or DDL? Or does it happen
automatically?
I tried to find the answer in the documentation at the above site but could not find it. Please help.
Followup:
DDL is processed, conceptually, as:
begin
commit; -- commit anything you started, even if DDL fails, you are committed
do the ddl;
commit; -- commit the ddl
exception
when others
then
rollback; -- the ddl
raise;
end;
/
You need not commit DDL, it'll commit 2 times for you.
You need to commit DML as DML does not do that.
This is regardless of the method used to execute SQL, native dynamic sql,
dbms_sql, static sql, whatever -- the rules are the same in all cases.
Thanks !! June 02, 2004
Reviewer: A reader
ok,
so i did
1.)
begin
begin
Execute immediate 'SET CONSTRAINTS ALL DEFERRED';
end;
begin
execute immediate 'delete from table ' || t;
end;
begin
execute immediate 'delete from table ' || t1;
end;
exception
when others then
rollback;
raise;
end;
/
I got the message that some constraint of t1 was violated (child record exists),
even though I did not issue a commit, post or rollback,
did not perform any DDL, and constraints are DEFERRED.
What is the issue? Can you help?
Followup:
...
begin
execute immediate 'set constraints all deferred';
execute immediate 'delete from ' || t;
execute immediate 'delete from ' || t1;
execute immediate 'set constraints all immediate';
end;
/
that is all the code you want -- the all immediate will validate the contraints
if they fail -- then the error (exception) will be propagated back to the client
and the rollback of the statement (which is whatever the client sent over to the
database -- a plsql block, whatever) will be undone.
June 02, 2004
Reviewer: A reader
...
begin
execute immediate 'set constraints all deferred';
execute immediate 'delete from ' || t; -- ***** this one
-- execute immediate 'delete from ' || t1;
-- execute immediate 'set constraints all immediate';
end;
/
The question is:
as soon as it executes the
'delete from ' || t;
statement, it raises the referential constraint violation exception. Why?
Given that:
1.) constraints are all deferred
2.) it is NOT DDL
3.) I am not committing (or ending the transaction) manually
Followup:
the constraint must not be deferrable.
you have to have deferrable constraints in order to defer them. the default is
"not deferrable"
7.3.4 - extracting 5 million records from 100 million record table June 15, 2004
Reviewer: A reader
Any suggestions on how to accomplish this on 7.3.4? No concept of nologging,
partitioning, Export/query clause etc in 7.3.4..
How can I extract about 5 million records (based on a condition) into a separate
database/server from a 100 million record table? What's the most efficient way?
I'm having to make do with very limited disk/space on both machines.
Thanks.
Followup:
5 million is pretty small.
I'd just
create table as select (the concept of nologging did in fact exist in 7.3 -- it was called unrecoverable);
export it
import it
or use the sqlplus copy command if they are connected via a nice network.
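Roughly, something like this sketch (owner/table names and connect strings are made up;
UNRECOVERABLE is the 7.3 spelling of nologging for the CTAS):

create table extract_5m unrecoverable
as
select * from big_table where <your condition>;

-- then exp/imp the small table, or straight over the network with SQL*Plus COPY:
copy from scott/tiger@source_db to scott/tiger@target_db -
create extract_5m using select * from big_table where <your condition>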
June 16, 2004
Reviewer: A reader
Thanks. There were some LONG columns in there, so 'create table as select' wouldn't
work. I'm ending up PL/SQL looping and hitting smaller chunks of the mega
table, then creating smaller target tables, then exp/imp.
I'm afraid I won't have sufficient temp space to do a SQL*Plus COPY FROM. Also,
the documentation says COPY is not intended for Oracle to Oracle DBs. No
idea why.
Followup:
copy is ONLY for oracle to oracle db's....
June 16, 2004
Reviewer: A reader
Per 7.3.4 documentation :
http://download-west.oracle.com/docs/pdf/A42562_1.pdf
Excerpt from it -
Note: In general, the COPY command was designed to be used
for copying data between Oracle and non-Oracle databases. You should use SQL
commands (CREATE TABLE AS and INSERT)
to copy data between Oracle databases.
Followup:
but if you think about it..... doesn't matter what the doc says, it takes two
connections. connections are only to oracle databases. sure you could be using
a gateway -- but even there, it would be true that create table as and insert
would work.
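The documented Oracle-to-Oracle alternative amounts to pulling the rows across a database link -- a sketch, where remote_db is an assumed link name:
create table emp_subset as
  select * from big_table@remote_db
   where some_flag = 'Y';

-- or, loading an existing table:
insert into emp_subset
  select * from big_table@remote_db
   where some_flag = 'Y';
Note that LONG columns generally cannot be selected via CTAS or across a database link, which is why the chunk-wise exp/imp above was needed in that particular case.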
Updating and Inserting 1 Million rows Daily with Bitmap indexes June 24, 2004
Reviewer: Sidda from Santa Clara,USA
Hi Tom,
Here we are facing a very big problem. We have a partitioned table with 180 million
records (70 columns, 10 bitmap and 10 B-tree indexes). End users need the bitmap
indexes 24/7, so we can't drop and recreate the table and indexes. Daily we have to
update and insert 1 million records. We tried bulk updates but in vain. What
is the best method to follow?
Thanks in advance
Sidda
Followup:
describe "in vain", what went wrong?
it would be best to do a SINGLE insert into and a SINGLE Update against this
table (not even in bulk - just single statements)
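A sketch of what "single statements" could look like here -- the staging table names and the join key are assumptions, not from the question:
-- one statement for the new rows
insert into fact_table
  select * from staging_inserts;

-- one statement for the changed rows
update fact_table f
   set (col1, col2) = ( select s.col1, s.col2
                          from staging_updates s
                         where s.pk = f.pk )
 where exists ( select null
                  from staging_updates s
                 where s.pk = f.pk );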
Creating Table with aggregated data from another table July 26, 2004
Reviewer: RB from NJ
Tom: Question related to creating a table with data from another table --
I want to create a table with a few fields and aggregate a few columns from
another table.
Here is the query:
create table SRC200401 as
  select ID, src, LOC, sum(BYTES) TOTAL, sum(billable_bytes) TOTAL_BILLABLE
    from SRC_MONTHLY
   where ID = 1072915200
   group by ID, LOC, SRC;
I have 128 million records in the SRC_MONTHLY table and it is taking more than 1
hour to get this table created. I have indexes on the ID, src and LOC fields.
Any faster way of getting this table created?
Followup:
so how many records have that id?
Great approach, but is it as fast for an IOT table? July 26, 2004
Reviewer: Peter Tran from Houston, TX USA
Hi Tom,
I'm trying the same approach with an IOT table. We have an IOT table
partitioned daily. I want to recreate this table with monthly partitions. I
do a CTAS parallel nologging using the new monthly partitioning, but it's SLOW.
Then again, the table does have 200 million rows. Is the "index organized"
part of the table the slow part?
Thanks,
-Peter
Followup:
what do the inputs/outputs look like (table structure) -- remember, you are
sorting 200,000,000 rows as well!
did you give it a nice juicy sort area size?
Unfortunately no. July 26, 2004
Reviewer: Peter Tran from Houston, TX USA
"...nice juicy sort area size" That would be a negative. :( Man, this is
going to take forever isn't it? Ugh...
SQL> show parameter sort_area
NAME TYPE VALUE
------------------------------------ ----------- --------
sort_area_retained_size integer 8388608
sort_area_size integer 8388608
SQL> desc odfrc;
Name Null? Type
----------------------------------------------------- -------- ------------
ODIFID NOT NULL NUMBER(10)
TRPORGN NOT NULL VARCHAR2(5)
TRPDSTN NOT NULL VARCHAR2(5)
POSCOUNTRYCODE NOT NULL VARCHAR2(3)
PAXTYPE NOT NULL VARCHAR2(1)
DCP NOT NULL NUMBER(2)
ODIFDATE NOT NULL DATE
FRCDATE NOT NULL DATE
BKGMEAN NUMBER
BKGMEANINFLUENCED NUMBER
BKGVARIANCE NUMBER
XXLMEAN NUMBER
XXLMEANINFLUENCED NUMBER
XXLVARIANCE NUMBER
Here's my CTAS:
CREATE TABLE ODFRC_MONTHLY (
ODIFID,
TRPORGN,
TRPDSTN,
POSCOUNTRYCODE,
PAXTYPE,
DCP,
ODIFDATE,
FRCDATE,
BKGMEAN,
BKGMEANINFLUENCED,
BKGVARIANCE,
XXLMEAN,
XXLMEANINFLUENCED,
XXLVARIANCE,
CONSTRAINT ODFRC_MONTHLY_PK PRIMARY KEY
(ODIFID, ODIFDATE, TRPORGN, TRPDSTN,POSCOUNTRYCODE, PAXTYPE, DCP, FRCDATE)
) ORGANIZATION INDEX nologging parallel 8
PARTITION BY RANGE (ODIFDATE)
(PARTITION ODFRC_20021130 VALUES LESS THAN (TO_DATE('2002-12-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX01 ,
PARTITION ODFRC_20021231 VALUES LESS THAN (TO_DATE('2003-01-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX02 ,
PARTITION ODFRC_20030131 VALUES LESS THAN (TO_DATE('2003-02-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX03 ,
PARTITION ODFRC_20030228 VALUES LESS THAN (TO_DATE('2003-03-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX04 ,
PARTITION ODFRC_20030331 VALUES LESS THAN (TO_DATE('2003-04-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX05 ,
PARTITION ODFRC_20030430 VALUES LESS THAN (TO_DATE('2003-05-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX06 ,
PARTITION ODFRC_20030531 VALUES LESS THAN (TO_DATE('2003-06-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX07 ,
PARTITION ODFRC_20030630 VALUES LESS THAN (TO_DATE('2003-07-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX08 ,
PARTITION ODFRC_20030731 VALUES LESS THAN (TO_DATE('2003-08-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX09 ,
PARTITION ODFRC_20030831 VALUES LESS THAN (TO_DATE('2003-09-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX10 ,
PARTITION ODFRC_20030930 VALUES LESS THAN (TO_DATE('2003-10-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX11 ,
PARTITION ODFRC_20031031 VALUES LESS THAN (TO_DATE('2003-11-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX12 ,
PARTITION ODFRC_20031131 VALUES LESS THAN (TO_DATE('2003-12-01',
'YYYY-MM-DD'))
TABLESPACE ODFRC_IDX13) AS SELECT
ODIFID,
TRPORGN,
TRPDSTN,
POSCOUNTRYCODE,
PAXTYPE,
DCP,
ODIFDATE,
FRCDATE,
BKGMEAN,
BKGMEANINFLUENCED,
BKGVARIANCE,
XXLMEAN,
XXLMEANINFLUENCED,
XXLVARIANCE
FROM odfrc nologging;
The original ODFRC table has daily partitions for one year. Anyway, can I
estimate how long this will take?
-Peter
Followup:
the nologging is in the wrong place
from odfrc nologging;
is the same thing as:
from odfrc a;
would you consider
a) create the table (empty, alter it to be nologging)
b) insert /*+ append */ into the individual partitions (instead of all at once)
in parallel sessions?
Create Table with data from an aggregated sum of few fields from another table July 26, 2004
Reviewer: RB from NJ
Tom: Question related to creating a table with data from another table --
I want to create a table with a few fields and aggregate a few columns from
another table.
Here is the query:
create table SRC200401 as
  select ID, src, LOC, sum(BYTES) TOTAL, sum(billable_bytes) TOTAL_BILLABLE
    from SRC_MONTHLY
   where ID = 1072915200
   group by ID, LOC, SRC;
I have 128 million records in the SRC_MONTHLY table and it is taking more than 1
hour to get this table created. I have indexes on the ID, src and LOC fields.
Any faster way of getting this table created?
Followup:
so how many records have that id?
RB: Tom -- this number varies; we have many IDs in the master table. If I
pass one id then the query will have one equality predicate on that ID; if more than
one, I was planning to use an IN clause. So I do not know how many records per id I
will have in the table at any given point of time.
Followup:
now I'm confused -- the predicate is variant? why create *that* table then?
what is the goal here -- to make queries of the form:
select ID, src, LOC, sum(BYTES) TOTAL, sum(billable_bytes) TOTAL_BILLABLE
  from SRC_MONTHLY
 where ID = :x
 group by id, loc, src;
faster in general? (you don't have to duplicate lots of text, it is all right
here)
Great suggestion! July 26, 2004
Reviewer: Peter Tran from Houston, TX USA
Hi Tom,
Thanks for the useful suggestion.
1) Part one done (easy).
2) Is this what you mean for part 2? When you say parallel sessions, do you
mean kick off a bunch of them using execute immediate?
INSERT /*+ append */ INTO odfrc_monthly(
ODIFID,TRPORGN,TRPDSTN,POSCOUNTRYCODE,
PAXTYPE,DCP,ODIFDATE,FRCDATE,BKGMEAN,
BKGMEANINFLUENCED,BKGVARIANCE,XXLMEAN,
XXLMEANINFLUENCED,XXLVARIANCE)
SELECT
ODIFID,TRPORGN,TRPDSTN,POSCOUNTRYCODE,
PAXTYPE,DCP,ODIFDATE,FRCDATE,BKGMEAN,
BKGMEANINFLUENCED,BKGVARIANCE,XXLMEAN,
XXLMEANINFLUENCED,XXLVARIANCE
FROM odfrc partition(ODFRC_20021114) nologging;
3) Should I still give it a nice juicy sort_area_size? :)
Thanks,
-Peter
Followup:
2) fire up N sqlplus sessions and run the insert append in each
3) sure, let each session have a big SAS -- alter session can be used.
Create Table with data from an aggregated sum of few fields from another table July 26, 2004
Reviewer: RB from NJ
Tom -- the user can select one or more ids. If I have more than one ID then I
was planning to use an IN clause in the where clause. The temp table that I am
creating will be used in a later phase of the app for other joins. What I am
looking for is a solution which will be much faster than my current approach. The
query that I have given, against a 120M row table, is taking more than 1 hour to
create the aggregated table.
Followup:
what is the query plan in general then.
you would not create a temporary table in oracle -- that would be so sqlserver.
just use that query in the "IN" statement in the first place!!!!!
Update... July 27, 2004
Reviewer: Peter Tran from Houston, TX USA
Hi Tom,
I wanted to give you an update on progress. The nice thing about your approach
is I can monitor the progress, but it's not as fast as I thought it would be.
I started 6 sessions and gave each session a sort_area_size = 200M. I then
executed a month's worth of inserts in each session. It's taking about 30 minutes
per partition/day to insert from the source daily table into the monthly table.
Each partition holds around 700K to 800K rows.
Should I expect it to run this long?
Thanks,
-Peter
Followup:
sounds long -- can you run one of the sessions with a 10046 level 12 trace and
see what they might be waiting on?
sort_area_size, pga_aggregate_target, insert /*+ APPEND*/ July 28, 2004
Reviewer: Dusan Valasek from Czech republic
Hi Tom,
I have two questions about setting a big sort_area_size before a direct path
load.
1) How does setting sort_area_size (at the session level) affect a direct path load?
I thought it was only useful for sort operations and building indexes.
2) If I use pga_aggregate_target, does setting sort_area_size (at the session level)
override pga_aggregate_target?
Thanks,
Dushan
Followup:
1) you are loading an IOT. big sort going on.
2) you would set the workarea policy to manual (in your session) to allow you to
override the sort area size for this "batch operation"
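A sketch of the session-level settings being described (the 200MB figure is illustrative only):
alter session set workarea_size_policy = manual;
alter session set sort_area_size = 209715200;            -- ~200MB for this batch session
alter session set sort_area_retained_size = 209715200;
With workarea_size_policy set to manual for the session, the explicit sort_area_size is honoured instead of pga_aggregate_target.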
July 28, 2004
Reviewer: A reader
Would you suggest re-creating the table when other users want to update it
online?
Followup:
huh?
How to change a DATATYPE of column. July 30, 2004
Reviewer: peru
Hi Tom,
How do I change the datatype of a particular column?
Say a table has a VARCHAR2 column used for dates,
and now I want to change the datatype to DATE.
Assume that the table has many records and is referenced by
other tables, procedures and triggers.
Thank you
Followup:
not really going to happen.
you can add a new column that is a date, update this new column to be to_date()
of the varchar2 and set the varchar2 to null, drop the old column and rename the
newly added column to the old name.
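A sketch of that sequence, assuming a table t with a varchar2 column dt_str holding dates in YYYY-MM-DD format (names and format are made up; the rename column step needs 9iR2 or later):
alter table t add (dt_new date);

update t
   set dt_new = to_date(dt_str, 'YYYY-MM-DD'),
       dt_str = null;

alter table t drop column dt_str;
alter table t rename column dt_new to dt_str;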
Update to July 27, 2004 August 22, 2004
Reviewer: Peter Tran from Houston, TX USA
Hi Tom,
Sorry it took a while to get back to you on this. You wanted to see a 10046
level 12 trace. I wanted to do some research first and I had to rebuild the
table to reproduce the step.
Here's the trace.
insert /*+ append */ into odfrc_monthly select * from odfrc
partition(ODFRC_20040201)
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.01 0 0 0 0
Execute 1 15.08 5776.28 11657 15874 111333 569939
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 2 15.09 5776.29 11657 15874 111333 569939
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 34
Rows Row Source Operation
------- ---------------------------------------------------
1 LOAD AS SELECT
569939 INDEX FAST FULL SCAN ODFRC_PK PARTITION: 185 185 (object id 12351)
Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
control file sequential read 8 0.00 0.00
enqueue 1911 3.00 5726.43
db file sequential read 38 0.03 0.20
db file scattered read 380 0.63 9.50
direct path write 19 0.00 0.00
direct path read 10 0.00 0.00
log buffer space 45 0.61 12.01
log file switch completion 2 0.28 0.33
log file sync 1 0.00 0.00
SQL*Net message to client 1 0.00 0.00
SQL*Net message from client 1 0.00 0.00
Jonathan Lewis wrote in his book that you cannot use direct-path insert using
the APPEND hint into an IOT table. Can this be the reason for the large enqueue
timed event? Are the 6 sessions waiting to lock the index to modify it?
Thanks,
-Peter
Followup:
Umm, you were blocked by someone else for an excessively long period of time
here. Enqueue waits:
enqueue 1911 3.00 5726.43
normal "blocker, blockee" problem here! doh, it was me doing it to you.
only one session at a time can append. only one session at a time can direct
path insert into a table.
sorry -- use normal insert, I goofed.
delete 2M records without dropping table. August 23, 2004
Reviewer: Sean from NJ, USA
Hi Tom,
I have to delete 2M records from a 6M record table, and the table has about 400
columns (Oracle 920, Solaris 9).
I understand your suggestion of creating a temp table with the records I
need, then dropping the original table and renaming the temp table.
But since our table is not that big and the application is using this table all
the time, we are trying to use the traditional delete method to accomplish this.
(1) If we have enough redo tablespace, what is a good number of rows for each
delete? I tried committing every 10000 records or every 1000 records. Both are quite
slow.
(2) Can you suggest anything to improve the speed of the delete in the following
procedure?
Thanks so much for your help
Thanks so much for your help
------------------------------------------------------------
create or replace procedure sp_del_audit(p_total_loop_time number)
is
total_loop_time number := 1;
BEGIN
WHILE total_loop_time <= p_total_loop_time
LOOP
delete /*+ parallel */ from audit a
where rownum < 10000
and exists ( select /*+ parallel */ null from audit_temp b
where a.Pkey = b.Pkey and
a.row_type = b.row_type and
a.update_time = b.update_time);
commit;
total_loop_time := total_loop_time + 1;
END LOOP;
END;
/
------------------------------------------------------------
Followup:
1) assuming you meant UNDO tablespace, there is no such thing as a REDO
tablespace...
removing 2 million rows, with 400 columns (every row in this table is CHAINED,
when you have more than 255 columns...) and probably many indexes -- it is not
going to be what you might term "speedy".
2) do it as a SINGLE delete (deleting 10,000 records in "parallel" doesn't even
begin to make sense here. PQ is for BIG BIG BIG things. this is tiny)
delete from audit
 where (pkey, row_type, update_time) in
       ( select pkey, row_type, update_time from audit_temp );
nothing more, nothing less. but don't expect this to be super fast if this
table is indexed. removing 33% of the index entries is going to take some
amount of time.
Deleting 90% of rows October 13, 2004
Reviewer: A reader
As part of some ETL processing, I need to delete 90% of the rows in a large
(6million rows) table. This table has a PK index so traditional delete takes a
long time. I see the following options
1. Mark the index unusable, delete and rebuild index nologging. This is
significantly faster than plain delete
2. create new table as remaining 10% of the rows, drop, rename, etc
3. Precreate a regular or GTT and do insert /*+ append */ in my batch program.
(2) is not very attractive as I don't want to be doing DDL in my code.
How would you compare (1) and (3) above? Would both of them generate similar
amounts of undo/redo and take roughly the same amount of time?
Thanks
Followup:
ETL is 'special' -- it isn't something that runs 5 times a minute.
I'd go with #2 actually: CTAS a new table, drop the old one, rename the new one to the old name.
#3 is "dangerous" -- you could lose everything in the event of a failure.
October 14, 2004
Reviewer: A reader
"#3 is "dangerous" -- you could lose everything in the event of a failure"
Why would I lose everything? What kind of failure? Instance or media failure?
Oracle guarantees recoverability of committed transactions, right? Why do you
bring that up here?
The only reason I don't want to do (2) is that I usually want to avoid DDL in
my code. The only difference between (2) and (3) is that the table is already
created in (3), right? Yes, (3) would generate much more redo/undo (less if I
create the table as nologging), but I thought using a GTT would cut down on the
redo/undo as well?
Followup:
i saw "use a gtt"
i could only assume you mean:
insert rows to keep into gtt
truncate table
insert rows from gtt into the table
that would be dangerous. if you fail after the truncate, before/during the
insert -- gone, all gone. gtt would be dangerous here.
(truncate is ddl btw...)
How can you Display multiple rows in one record October 14, 2004
Reviewer: Mack from USA
Hi Tom,
Let's suppose in deptno 10, there are 3 to 5 employees, I want to see the emp
names like
TOM, KIM, JOHN... and so on.
Is there an easy way to do it in SQL? Not through PL/SQL. The number of records
is unknown; it could be 10, 20 or one hundred. Please advise.
Followup:
collect() in 10g
stragg (search for it -- nasty plsql you install once and use over and over and
over and over and over) in pre 10g
Query first and then update October 15, 2004
Reviewer: A reader
We have two tables with approximately 28 million and 35 million records
respectively. These tables are joined to produce data to be displayed to users
using IE browsers. Based on this information, around 10000 records get added to
these tables and around the same updated daily. Our SLA is to display each
screenful (100 rows) in 2-4 seconds.
While partitioning is being reviewed to improve the performance for the queries,
could you let us know if there are any issues regarding partitions? For instance
someone has reported that using global indexes on a partitioned table has
degraded the performance.
Followup:
i seriously doubt partitioning is going to be used to increase the performance
of these queries.
partitioning -- great for speeding up a full scan. are you suggesting to full
scan and still return in 2-4 seconds?
"someone reported using global indexes ... degraded performance"
I just about fell out of my chair on that one. geez.
If you have my book "Effective Oracle by Design" -- I go into the "physics
behind partitioning".
In order to return the first 100 rows to a web based application -- you are
going to be using indexes (or you are not going to be doing what you signed up
to do -- funny, you have an SLA in place but no idea if you can live up to
it...). Whether the tables are partitioned or not probably won't have any
bearing on making this faster. Partitioning could certainly make things slower
if you do it wrong, but in a system like this, you would be looking at
partitioning to make administration easier and hopefully not NEGATIVELY impact
performance.
given two tables...
index access
get 100 rows
I personally would be shooting for well under 1 second response times for
everything -- regardless of whether there was 1 row or 1 billion.
I don't get the tie-in to "query first and then update" though.
October 18, 2004
Reviewer: A reader
"Our SLA is to display each *screenful* (100 rows) in 2-4 seconds."
In our case the queries, after joining 2 tables of 28 million and 35 million
records, could return 10 screens or 600 screens or many more screens based on
the query parameters. Each screenful (with 100 records in each screen) should
appear in 2-4 seconds.
Followup:
so, you've signed up for an SLA you have no idea if you can meet.
but hey -- using indexes to retrieve 100 rows from 1, 1000, 10000000000000
should be about the same amount of time (and way under subsecond).
but -- getting to "screen 600" is not. Look to google as the gold standard for
searching and web pagination
o totally estimate the number of returned rows -- don't even THINK about giving
an accurate count
o don't give them the ability to go to page "600", pages 1-10 is more than
sufficient
o even if there is a page 600 -- realize it doesn't make sense to go there (no
human could know "what I need is on page 600" -- 6,000 rows into this result
set). Google stops you at page 99
o understand that page 2 takes more time to retrieve than page 1, page 50 more
than 2, and so on (as you page through google --each page takes longer)
But perhaps most importantly -- laugh at people that say things like:
"For instance
someone has reported that using global indexes on a partitioned table has
degraded the performance."
and say -- "yes, it has when improperly applied it can degrade performance,
there have also been sightings of systems where it didn't affect performance at
all, there have also been reported instances of them massively improving
performance. Now what we have to do is understand how the feature works, what
it does, and how it might apply to our problem"
The neat thing about that paragraph -- it is INFINITELY reusable. You can use
it with regards to any feature!
(if you have Effective Oracle by Design -- I go into the "physics" of
partitioning and how -- without the judicious use of global indexes -- your system
could fall apart and run really really slow as well)
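For the index-friendly "first n pages" access being described, a commonly used pattern (a sketch; the bind names are illustrative) is the nested ROWNUM query:
select *
  from ( select a.*, rownum rnum
           from ( -- your ordered query here
                  select ename, hiredate
                    from emp
                   order by hiredate ) a
          where rownum <= :max_row )
 where rnum >= :min_row;
The inner ROWNUM predicate lets Oracle stop fetching as soon as it has :max_row rows, which is also why later pages cost progressively more than earlier ones.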
Senior Statistical Analyst October 18, 2004
Reviewer: Hank Freeman from Atlanta, GA
18 October 2004
Tom,
Thanks for the lob_replace code !!!
It worked wonders when trying to fix about 25,000 clobs with a known error.
Here is my detailed discussion, in outline and then in detail.
1. What type of data was in the CLOB
2. What went wrong to create the error
3. What was done to correct it.
a. Stored Proc
b. Declare syntax
c. VBA code in Excel
1. What type of data was in the CLOB. The company has about 45,000
internet/intranet web pages stored in an Oracle database table field which is a
CLOB. Meaning that instead of serving the website from a file server or physical
location, the entire website's web pages/HTML source code is held in the
table in this CLOB field.
2. What went wrong to create the error. An error occurred when these records
were being modified to remove a specific piece of information and insert a
replacement null character. The null character for some unknown reason did not
work, and the web-page information in the CLOB field got garbage appended to the
end of the web page after the closing