Hi, Is it enough if we use ALTER TABLE CACHE; to push the table to cache ?. Do I need to do anything else to have the table in the Memory ? Thx, VJ
That command simply makes a notation in the data dictionary that blocks from this table should be handled differently when they get into the cache. Usually, when we full scan an object, we put the blocks onto the least recently end of the list. These blocks are candidates for "aging" from the buffer cache. By altering the table to 'cache', we put the blocks onto the most recently used end -- making them less prone to being aged out of the buffer cache. From the sql reference manual we see: ... for data that is accessed frequently, specifies that the blocks retrieved for this table are placed at the most recently used end of the LRU list in the buffer cache when a full table scan is performed. This attribute is useful for small lookup tables. ... So, no -- issueing the alter table ... cache command does not put the table into memory, you must full scan the table for that to happen. Consider looking into setting up multiple buffer pools and associating this table with its own buffer pool if you are dead serious about caching it. Bufrer pools are Schema objects are referenced with varying usage patterns; therefore, their cache behavior may be quite different. Multiple buffer pools enable you to address these differences. You can use a KEEP buffer pool to maintain objects in the buffer cache and a RECYCLE buffer pool to prevent objects from consuming unnecessary space in the cache. When an object is allocated to a cache, all blocks from that object are placed in that cache. Oracle maintains a DEFAULT buffer pool for objects that have not been assigned to one of the buffer pools. Bear in mind tho... there is really no true way to a have a purely "in memory" table -- even cached tables are subject to aging from the buffer cache.
Reviews | |
---|---|
a
Bookmark | Bottom | Top
Reviewer:
A reader
Hi Tom, What is your approach to find which tables to cache in SGA? In a properly tuned system, this caching should be unnecessary and won't give much benefit, is it correct? Thanks, Followup October 27, 2003 - 6pm Central time zone: you normally should not ever need to use this, it only comes into play when you full scan a table that exceeds the short table threshold -- causing us to cache it differently. it is rare to do that. you would know you are doing it. i've used it once in my experience.
a
Bookmark | Bottom | Top
Reviewer:
A reader
Thanks Tom. Do you mean that I should use "alter table ... storage (buffer_pool keep)" instead of "alter table ... cache", for small frequently-accessed look-up tables? Followup October 27, 2003 - 7pm Central time zone: not really -- they were always "fully cachable" as is. I ascribe to the 99/1 rule (variation on the 80/20 rule). the buffer cache is fairly efficient. If it was meant to be cached, and the memory exists, it will be cached. small, frequently accessed lookup tables would be in the cache by "popular demand" 99% of the time already.
a
Bookmark | Bottom | Top
Reviewer:
Dave from Colorado
One of the situations for getting involved in setting the buffer pool attribute might be in a partitioned fact table -- you might choose to set the most recent few partitions to "keep", and the older ones to "recycle", thus helping to prevent a more rare scan of older data from spoiling the day of the majority of users who access recent data.
a
Bookmark | Bottom | Top
Reviewer:
Duncan from UK
From what was said above am I to take it that putting large tables in the buffer keep pool is going to be pointless because if they take up most of the pool then they'll get aged out as soon as a new object requires the space in memory to be cached? Followup October 28, 2003 - 8am Central time zone: no? that is not what it means. blocks are aged out of buffer pools based on how often they are used. So, if you put large tables in a keep pool AND you use their blocks lots (more then other blocks), they'll stay in there.
a
Bookmark | Bottom | Top
Reviewer:
A reader from Santo Domingo, Dominican Republic
Is there any sample about the performance impact of caching tables. Followup October 28, 2003 - 10am Central time zone: blocks are always cached. the alter table cache command is somewhat of a misnomer. it simply changes the way a full scan of a LARGE table would be treated in the buffer cache. It doesn't "cache a table". Performance is relative. alter table cache can kill performance. alter table cache can enhance positively performance. alter table cache can have zero effect on performance. it is a tool, that can sometimes - in rare cases (eg: example posted by another reader -- Dave -- above) be used.
a
Bookmark | Bottom | Top
Reviewer:
Christo Kutrovsky from Ottawa, ON Canada
Tom, You mention the "short table trheshold" what is it ? According to Oracle documentations it should be 5 database blocks. However I have a table that uses 935 blocks, and is 37 000 rows, and when I full scan it the "table scans (short tables)" statistic is incremented. What are your observations? Followup February 16, 2004 - 1pm Central time zone: http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:255215154182 http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96533/hintsref.htm#15719
a
Bookmark | Bottom | Top
Reviewer:
Christo Kutrovsky from Ottawa, ON Canada
Thanks.
a
Bookmark | Bottom | Top
Reviewer:
Kyle from VA, USA
Thanks for posting this!
a
Bookmark | Bottom | Top
Reviewer:
A reader
Hi Tom, This is a question for you :) I have a value that is in a table, but this value is not going to change in all database life. I can keep the table in memory but even then, Oracle will have block gets to get that value. I can use a package begiend, a better solution, but per session there will be a read to the table in memory. My question is can I put a value in memory for public access. So I read the value at database startup and from then it is read from memory from all session? Thanks Tom Followup May 18, 2004 - 7am Central time zone: that package variable is the solution you are looking for. You are over analyzing the situation. the "read to the table in memory" isn't anything to be concerned about.
a
Bookmark | Bottom | Top
Reviewer:
A reader
a
Bookmark | Bottom | Top
Reviewer:
Anurag from INDIA
Hi Tom, I am not clear about the difference between putting table to cache with alter table ....cache command and using execute ....keep('table') Can you please explain the difference actually what happens during both? best regards, Anurag Followup December 14, 2004 - 9am Central time zone: what is "execute ....keep('table')" exactly? if you mean altering the table to be in the keep pool, then read on. alter table cache's behavior is described above, for the original answer. putting a table into the keep pool just changes the buffer cache where the blocks are cached. Instead of the blocks going into the default buffer pool -- they go into the keep buffer pool. Otherwise, the behaviour is precisely the same as just a "normal table". All it does is move where the blocks are placed -- the keep pool instead of the default pool.
a
Bookmark | Bottom | Top
Reviewer:
Anurag from INDIA
Thanks Tom, Well, I may be cloudy at my concepts. How the block of same table behaves if it is kept in keep pool rather cached. Or is it the same behaviour wise. Can you please illustrate this. Followup December 15, 2004 - 1pm Central time zone: when you full scan a long table (bigger than say DUAL) blocks resulting from that full scan are eligible for aging out of the buffer cache right away. This prevents a single big full scan from WIPING OUT your buffer cache. When the full scan needs more space in the buffer cache for the blocks it is reading, the ones that get pushed out are the ones the full scan just put in there -- instead of all of the other data. If you alter table t CACHE, you simply change that behaviour. The blocks read from that long full table scan are not subject to that behaviour. they WILL not be the first ones "out". So, a big full scan (say by accident) would in effect flush your buffer cache if the table was "cache". With a keep pool -- an orthogonal concept from CACHE -- the two have really not much to do with eachother at all -- you are saying "when you do put a block from this table into the buffer cache, please put it over here, in this keep pool -- NOT in the default cache" So, if a table is set to point to a keep pool, it behaves NOT ANY DIFFERENTLY than a table not pointing to the keep pool, it is just that its blocks will be put in the keep buffer pool instead of the default. Don't even try to compare these two things, they have not much to do with eachother at all. CACHE -- where in the "LRU" (conceptually speaking) the block read in from disk goes in the buffer cache KEEP/RECYCLE/DEFAULT -- three buffer pools you may optionally set up.
a
Bookmark | Bottom | Top
Reviewer:
Shimon Tourgeman
a
Bookmark | Bottom | Top
Reviewer:
A reader
hi tom, why doesn't oracle provide named caches? eg. if there are say 10 applications in the database wouldn't it be more efficient for each application to have also 10 seperate caches? the lio / pio behavior & impacts would be per application. they wouldn't influence each other. regards, max Followup January 11, 2005 - 10am Central time zone: No, in my experience what you would achieve is "gotta buy tons more ram in order to do this since each application refuses to share what they have with anyone else" there are up to 7 'caches' now, more than two (actually, in most all cases more than one) seems overkill.
a
Bookmark | Bottom | Top
Reviewer:
prathima from USA
a
Bookmark | Bottom | Top
Reviewer:
Randall from CA USA
Tom; Recently you recommended considering setting up a "keep" pool. It is my understanding that there are no keep or recycle pools for alternate buffer caches, only for the default block size cache. Therefore designating a table as "keep" will break if the table is moved to an alternate block size tablespace later for tuning purposes. However, creating it as a CACHE type table should work regardless of the block size. At least that is what I am telling my students these days. What do you think? Followup February 7, 2005 - 4am Central time zone: where did I recomend that? but the keep/recycle pools are just "alternate" buffer caches -- like a different sized block one would be... eg: another way to have a "keep pool" would be to create a non-standard blocksize tablespace and move the object into that. You achieve basically the same effect.
a
Bookmark | Bottom | Top
Reviewer:
A reader
hi Tom, Table UBER_INTERVAL currently has only 146 blocks and since it is a small table, Oracle is doing a FTS for a query on this table, even though the same can be achieved in fewer 'consistent gets' by using a bitmap index. FTS consistent gets : 152 bitmap index consistent gets : 4 My question : How do I artificially make this table NOT to belong to this 'small table' category ? Even after setting the 'number of blocks' to 10000 using the below dbms_stats, there is no effect(it is still using a FTS, for the query) and the consistent read is still 151 in sql_trace, not sure why the dbms_Stats is not having any effect begin dbms_Stats.set_table_stats(ownname=>'MEC_USER',tabname=>'UBER_INTERVAL_C',numblks=>10000); end; SELECT uber_interval FROM uber_interval_c a WHERE interval_type_code = :"SYS_B_0" call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.06 0.15 0 304 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 2 0.08 0.29 0 151 0 2 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 4 0.14 0.45 0 455 0 2 Misses in library cache during parse: 1 Optimizer goal: CHOOSE Parsing user id: 20 Rows Row Source Operation ------- --------------------------------------------------- 2 TABLE ACCESS FULL OBJ#(1837479) (cr=151 r=0 w=0 time=299577 us) db version 9203 db_cache_size big integer 738197504 I guess there should be some undocumented parameter for setting this small table threshold ? Why is the dbms_Stats.set_table_stats not having any effect( as shown in sql_trace ? thanks Anto Followup February 9, 2005 - 1am Central time zone: do you care to share a test case (create table, insert into table select..., create index, dbms_stats.gather, etc) just like I would. and why are you using cursor_sharing -- seems like you have a bug in the code that needs fixing soon so you can turn that OFF. (this is probably more along the lines of "over binding is my problem" rather than stats. oracle will not full scan a table regardless of the size if the cardinality to be retrieved is small -- so, i'm thinking this is "cursor sharing" -- NOT "small table threshold" set table stats, i don't see how you can say "as shown in sql_trace", i see nothing in there that would indicate anything?
a
Bookmark | Bottom | Top
Reviewer:
A reader
corrrection : the db version is 9204 not 9203 and I did a flush of shared_pool before the 2nd run(after setting dbms_Stats)
a
Bookmark | Bottom | Top
Reviewer:
A reader
Hi Tom, You are right, after setting the following (cardinality for the column in where condition) begin dbms_Stats.set_column_stats(ownname=>'MEC_USER',tabname=>'UBER_INTERVAL_C',colname=>'INTERVAL_TYPE_C ODE',distcnt=>1000); end; the optimizer did use the bitmap index instead of going for a full table scan(FTS). Here are some other things I noted a) Cursor_sharing(whether it was the default (exact) or Similar) - did not make any difference to the above ( I did flush the shared pool each time, otherwise the cursor_sharing was not having any effect) b) I had to increase both the numblks(for table) as well as distcnt( for the column in the where condition), for the optimizer to choose the index instead of FTS. So I assume the 'Small table threshold' was actually coming into play here. Is my assumption right ? For my previous post, it should have been tkprof not sql_trace. Also my question(why tkprof was still showing 151 for cr- consistent reads, even after setting table_stats) was stupid, since sql_trace or tkprof always shows the actual values irrespective of the table or column stats. Sorry about that Thanks for your help Anto Followup February 9, 2005 - 2pm Central time zone: so basically, the optimizer just didn't have "correct data" give it the right stuff and it works. give it incomplete or stale stuff and it doesn't small table threshold is not coming into play. cardinalities, rows retrieved, amount of data to process - they are. small table threshold has to do with how the blocks are placed into the cache.
a
Bookmark | Bottom | Top
Reviewer:
A reader
And raising question in your site is far far better than raising TARs with oracle - both from the point of view of response time as well as for getting the workarounds/usefulness. Metalink.oracle.com is really useful,but not raising TARs with oracle, maybe it depends on the type of support our client is having with oracle. thanks Anto
a
Bookmark | Bottom | Top
Reviewer:
A reader
I was under the impression that if the number of blocks in a table is below some threshold value, Oracle will always do a FTS, irrespective of statistics, even if index access might be cheaper. Anto Followup February 9, 2005 - 3pm Central time zone: FALSE.. bzzt. provably false: ops$tkyte@ORA9IR2> create table t ( x int primary key, y int ); Table created. ops$tkyte@ORA9IR2> insert into t values ( 1, 1 ); 1 row created. ops$tkyte@ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T', cascade=>TRUE ); PL/SQL procedure successfully completed. ops$tkyte@ORA9IR2> set autotrace traceonly explain ops$tkyte@ORA9IR2> set linesize 121 ops$tkyte@ORA9IR2> select * from t where x = 1; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=6) 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=1 Card=1 Bytes=6) 2 1 INDEX (UNIQUE SCAN) OF 'SYS_C005654' (UNIQUE) it has never been true...
a
Bookmark | Bottom | Top
Reviewer:
A reader
Thanks, Tom for the clear example. Not sure how that wrong notion came into my head
a
Bookmark | Bottom | Top
Reviewer:
A Reader from India
Hi, Why index scan is cheaper for 1 record from 1 block table. Thanks Followup October 1, 2005 - 9pm Central time zone: cheaper than what?
a
Bookmark | Bottom | Top
Reviewer:
A Reader from INIDA
Cheaper than Full Table Scan. Followup October 2, 2005 - 10am Central time zone: test it and see. tablespace ASSM is auto segment space managed tablespace MSSM is manual segment space managed Less LIO's with the index - we can read the single index block, the single table block. With ASSM we read all blocks below the high water mark which is always advanced "high" (part of the design) as well as blocks that represent the extent map. With MSSM we read all of the blocks below the high water mark as well as blocks that represent the extent map. ops$tkyte@ORA9IR2> create table t ( x int constraint t_pk primary key, y int ) tablespace assm; Table created. ops$tkyte@ORA9IR2> insert into t values ( 1, 2 ); 1 row created. ops$tkyte@ORA9IR2> ops$tkyte@ORA9IR2> set termout off ops$tkyte@ORA9IR2> ops$tkyte@ORA9IR2> set autotrace traceonly statistics ops$tkyte@ORA9IR2> select * from t where x = 1; Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 2 consistent gets 0 physical reads 0 redo size 424 bytes sent via SQL*Net to client 499 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed ops$tkyte@ORA9IR2> select * from t where x = 1; Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 7 consistent gets 0 physical reads 0 redo size 424 bytes sent via SQL*Net to client 499 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed ops$tkyte@ORA9IR2> set autotrace off ops$tkyte@ORA9IR2> ops$tkyte@ORA9IR2> drop table t; Table dropped. ops$tkyte@ORA9IR2> create table t ( x int constraint t_pk primary key, y int ) tablespace mssm; Table created. ops$tkyte@ORA9IR2> insert into t values ( 1, 2 ); 1 row created. ops$tkyte@ORA9IR2> ops$tkyte@ORA9IR2> set termout off ops$tkyte@ORA9IR2> ops$tkyte@ORA9IR2> set autotrace traceonly statistics ops$tkyte@ORA9IR2> select * from t where x = 1; Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 2 consistent gets 0 physical reads 0 redo size 424 bytes sent via SQL*Net to client 499 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed ops$tkyte@ORA9IR2> select * from t where x = 1; Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 3 consistent gets 0 physical reads 0 redo size 424 bytes sent via SQL*Net to client 499 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed ops$tkyte@ORA9IR2> set autotrace off ops$tkyte@ORA9IR2>
a
Bookmark | Bottom | Top
Reviewer:
A Reader from India
a
Bookmark | Bottom | Top
Reviewer:
Markus from Austria
dear tom, you mentioned a few comments before: "When the full scan needs more space in the buffer cache for the blocks it is reading, the ones that get pushed out are the ones the full scan just put in there -- instead of all of the other data." this sounds like the fts reuses "his own" buffers. say our buffer cache is 10000 blocks in size. say we are starting a fts over a table 3000 blocks in size. what happens? how many of the original 10000 block buffers will be effectivly aged out and used by this fts? Followup October 3, 2005 - 11am Central time zone: depends on how full the cache is to begin with and what else is going on. No fixed "fast rule"
a
Bookmark | Bottom | Top
Reviewer:
Markus from Austria
dear tom, thanks for your answer. could you please explain how a fts decides to reuse "his own" buffers? Followup October 4, 2005 - 3pm Central time zone: it is based on the way they are added to the cache, they go out at the ends, not in the middle - so when a buffer needs to be reused - they look like the oldest buffers.
a
Bookmark | Bottom | Top
Reviewer:
Markus from Austria
tom, i thought about that. it think the fts is done "db_multiblock_read_count"-wise: 1. the session searches for free buffers 2. the free buffers are pinned 3. one time "db_multiblock_read_count" datablocks are read into this buffers 4. some operations occur on this data 5. the buffers are marked free 6. the session searches for free buffers (finding the ones recently used by itself) 7. ... it that the way it works? or am i totally wrong? Followup October 5, 2005 - 7am Central time zone: not really, think of it like this when we full scan a large table (we have thresholds to define "large" as a percent of the buffer cache), we'll put the blocks on the "OLD" end of the list, instead of the "NEW". OLD blocks age out faster. Since these new, yet marked as OLD blocks are on the OLD end of the list, they will be the first to go when we need more space. They will not kick the "NEW" blocks off of the list....
a
Bookmark | Bottom | Top
Reviewer:
Markus from Austria
hi tom, thanks for your answer but you misunderstood. my question was and is still related to your reply found above "...the ones that get pushed out are the ones the full scan just put in there...". let me try to refine. i know that the fts places its blocks at the cold end of the lru list per definition. but how can it happen that the same fts which places its blocks into buffers at the cold end of the lru list can REUSE these block buffers recently used by ITSELF? the fts is aging out blocks read-in by itself!? that's why i was thinking the data is read "db_multiblock_read_count" wise. because it has to be read in chunks. the fts would not be able to reuse buffers recently used by its own otherwise. please, could you explain on this, especially this "chunk" algorithm? Followup October 5, 2005 - 11am Central time zone: you have a buffer cache.... you are full scanning.... In order to do so, you need a free block. So, where do you get it? From the "cold end" as you say. You use it, you put it on the "cold end" (large table full table scan....) Later you need another block - where do you get it? from the cold end - you know, right where you put the last stuff.... The fts is aging out blocks read in by itself - precisely. so what if it is read in in chunks - you read 8 blocks - what do you need? You need 8 free blocks - go bump out others you already read and processed.
a
Bookmark | Bottom | Top
Reviewer:
Markus from Austria
yes, yes, yes, come on. could you please go into depth on that? what i mean is that there must be an algorithm like "read in a chunk of blocks" - "process this chunk of blocks" - "read in another chunk of blocks" (LEADING TO A REUSE OF THE BLOCK BUFFERS RECENTLY USED BY ITSELF) - "process this chunk of blocks" - "read in ..." a fts is done with multiblock i/o not single block i/o per definition. that's why i thought it might be done "db_multiblock_read_count" wise. who decides how many blocks are read in? how many blocks are read in per chunk? is it "db_multiblock_read_count", is it a threshold, is it a formular? because only this "chunk wise thing" leads to the reuse of block buffers recently used by itself. could you please explain this "chunk wise thing" algorithm? Followup October 6, 2005 - 7am Central time zone: no no no no no, because - it frankly doesn't really matter to us. All *we* need to understand is that when we full scan - we won't wipe out the buffer cache. that is it pretty much. The internals - they change. We've used the term "lru" to conceptually describe the process, well, it isn't an LRU, hasn't been since 8i, it uses a touch count. We can conceptually understand what happens, the actual mechanics, not relevant and no I won't go further into it. db_file_multiblock_read_count is documented to be the deciding that that controls the maximum size of a read, yes. That is what it does. there is no true "chunk wise thing", all things are done at the block level in the buffer cache - a multi-block IO leads to a wait for "db file SCATTERED read" meaning the blocks are SCATTERED in the buffer cache (we hash their DBA's data block addresses to figure out what LIST of cached blocks - of which there are many - to put them on. They are not treated as a "chunk", they do not travel together). If you want to understand some of the mechanics of the cache, how Oracle uses memory and such - I did write about that in some depth in my latest book (and in expert one on one Oracle - but things change, that one only covers up to 8i, the new one up to 10g)
a
Bookmark | Bottom | Top
Reviewer:
Markus from Austria
hi tom, ok, i will take a look at your book "expert one on one oracle", first edition, i have. should i simply search for "cache management" or is there a special page number i should look at? by the way, it would be hot - i think - if we could redirect blocks read in by a fts to the recycle buffer "on the fly"(*) using a hint. eg. "select ...". might be a nice option for hybrid systems. (*) NO, i don't want to change the definition "on the fly" using dynamic sql. ;) Followup October 6, 2005 - 11am Central time zone: why? we'd never be able to FIND the darn things again if you told us dynamically what cache to put them in. that, and it is not necessary, since, well, the algorithm already says "don't flush out other stuff - flush out the full scanned blocks first anyway" Suggest reading the first 1/3 of Expert :) it is sort of my version of the concepts guide. Buffer cache stuff is sprinkled throughout.
a
Bookmark | Bottom | Top
Reviewer:
A reader
Many performance manuals indicate that if a "large" table is scanned/used frequently, it may benefit to configure a KEEP buffer. I understand as you say "if you put large tables in a keep pool AND you use their blocks lots (more then other blocks), they'll stay in there." Assuming the size of a particular table is larger than the KEEP pool itself, what impact does this impose? Is there still a benefit to utilize the KEEP pool? Followup November 11, 2005 - 10am Central time zone: the caching of the blocks depends on how the blocks are retrieved. a large table read via single block IO (index reads) will be cached one way (the blocks will not age out really fast). I would not suggest the arbitrary use of the KEEP pool unless you had identified a problem (IO wise) that could be fixed by using it. Have you?
a
Bookmark | Bottom | Top
Reviewer:
A reader
I was just curious from a theoretical perspective. Assuming the blocks are retrieved via a full table scan and the table itself is much larger than the size of the KEEP pool. Would it really benefit to configure the KEEP pool for this table? Followup November 13, 2005 - 10am Central time zone: yes, no, maybe, it depends. Do the blocks need to be kept cached? should they be cached? are you using parallel query (more common with large full table scans) - then the cache doesn't really matter (direct io). some of the blocks would be found in the buffer cache after the full scan. so, some of the blocks may be available for other queries. In general, I'd rather have a reason for using the non-default pools, rather than hypothesize all of the possible reasons you may or may not have for using it. Here is something I've written on this recently in Expert Oracle Database Architecture: Block Buffer Cache So far, we have looked at relatively small components of the SGA. Now we are going to look at one that is possibly huge in size. The block buffer cache is where Oracle stores database blocks before writing them to disk and after reading them in from disk. This is a crucial area of the SGA for us. Make it too small and our queries will take forever to run. Make it too big and we’ll starve other processes (e.g., we won’t leave enough room for a dedicated server to create its PGA, and we won’t even get started). In earlier releases of Oracle, there was a single block buffer cache, and all blocks from any segment went into this single area. Starting with Oracle 8.0, we had three places to store cached blocks from individual segments in the SGA: * Default pool: The location where all segment blocks are normally cached. This is the original—and previously only—buffer pool. * Keep pool: An alternate buffer pool where by convention you would assign segments that were accessed fairly frequently, but still got aged out of the default buffer pool due to other segments needing space. * Recycle pool: An alternate buffer pool where by convention you would assign large segments that you access very randomly, and which would therefore cause excessive buffer flushing but would offer no benefit because by the time you wanted the block again it would have been aged out of the cache. You would separate these segments out from the segments in the default and keep pools so that they would not cause those blocks to age out of the cache. Note that in the keep and recycle pool descriptions I used the phrase “by convention.” There is nothing in place to ensure that you use neither the keep pool nor the recycle pool in the fashion described. In fact, the three pools manage blocks in a mostly identical fashion; they do not have radically different algorithms for aging or caching blocks. The goal here was to give the DBA the ability to segregate segments to hot, warm, and do not care to cache areas. The theory was that objects in the default pool would be hot enough (i.e., used enough) to warrant staying in the cache all by themselves. The cache would keep them in memory since they were very popular blocks. You might have had some segments that were fairly popular, but not really hot; these would be considered the warm blocks. These segments' blocks could get flushed from the cache to make room for some blocks you used infrequently (the “do not care to cache” blocks). To keep these warm segments blocks cached, you could do one of the following: * Assign these segments to the keep pool, in an attempt to let the warm blocks stay in the buffer cache longer. * Assign the “do not care to cache” segments to the recycle pool, keeping the recycle pool fairly small so as to let the blocks come into the cache and leave the cache rapidly (decrease the overhead of managing them all). This increased the management work the DBA had to perform, as there were three caches to think about, size, and assign objects to. Remember also that there is no sharing between them, so if the keep pool has lots of unused space, it won’t give it to the overworked default or recycle pool. All in all, these pools were generally regarded a very fine, low-level tuning device, only to be used after most all other tuning alternatives had been looked at (if I could rewrite a query to do one-tenth the I/O rather then set up multiple buffer pools, that would be my choice!). Starting in Oracle9i, the DBA had up to four more optional caches, the db_Nk_caches, to consider in addition to the default, keep, and recycle pools. These caches were added in support of multiple blocksizes in the database. Prior to Oracle9i, a database would have a single blocksize (typically 2KB, 4KB, 8KB, 16KB, or 32KB). Starting with Oracle9i, a database can have a default blocksize, which is the size of the blocks stored in the default, keep, or recycle pool, as well as up to four nondefault blocksizes, as explained in Chapter 3. The blocks in these buffer caches are managed in the same way as the blocks in the original default pool—there are no special algorithm changes for them either. Let’s now move on to cover how the blocks are managed in these pools.
a
Bookmark | Bottom | Top
Reviewer:
Jack Douglas from Maidenhead, UK
Hi Tom, We have a large table (95% of the whole database) that is often queried via an index. The problem we have is that even though we access only a very small percentage of the table in a query, it is still a large number of blocks relative to the buffer size. Because we are not doing a full scan am I right that potentially the buffer cache will be flushed by these queries? Is it true that the small table threshold or 'cache' setting will not prevent this happening to any of the blocks currently in the buffer? If so, does that make this an immediate candidate for a 'recycle' pool? Thanks for your input, Jack Followup December 16, 2005 - 8am Central time zone: small table thresholds/cache settings are relevant for full scans. nothing makes anything an immediate candidate for the recycle pool. Have you diagnosed a physical IO problem on your system regarding other objects (beyond this one) Have you considered an index organized table - to reduce the number of blocks required to satisfy an index range scan (instead of a possible a) 3 or 4 blocks to read index b) plus number of blocks = number of rows retrieved for table access by index rowid in worst case you would have a) 3 or 4 blocks to read index to find first row b) plus as few blocks as it can take to store the data ?
a
Bookmark | Bottom | Top
Reviewer:
David Aldridge http://oraclesponge.blogspot.com from Colorado Springs
We were discussing here http://tinyurl.com/cyyyu the effect of read ahead caching on FTS of small tables. Do you think that buffer caching of small tables might be a "valid" method of avoiding i/o inefficiency due to read ahead kicking in towards the end of the table scan and inadvertantly reading past the HWM? Or is that crazy talk? Followup December 16, 2005 - 12pm Central time zone: this reminds me sort of "separate indexes from tables" in a way ;) Might be more reasonable to place "large segments you plan on multi-block IO'ing together and separate them from segments you plan o single-block IO'ing" at the volume level in order to prevent read ahead from kicking in. Reason I say this is - it was observed that separating indexes from data helped a system once. turned out it was the separation of single block IO'ed objects from multi-block IO'ed objects - and it was the suppression of "magic read ahead algorithms" on the single block IO'ed things that made the difference.
a
Bookmark | Bottom | Top
Reviewer:
Jack Douglas from Maidenhead, UK
No, at the moment I only suspect a PIO problem on other objects. I will focus my attention on proving this one way or the other. If it proves to be the case would you then suggest considering splitting the cache? Strangely enough it is IO that prevents us using an IOT. The queries at the end of the script are like a typical query on our system - the fact table is queried by dimension1, dimension2 and also by dimension1, dimension3: SQL*Plus: Release 10.2.0.1.0 - Production on Fri Dec 16 16:28:33 2005 Copyright (c) 1982, 2005, Oracle. All rights reserved. Connected to: Oracle9i Release 9.2.0.6.0 - Production JServer Release 9.2.0.6.0 - Production 16:28:33 TRACKER@oracle> create cluster t_cluster (id1 integer, partial_id2 integer, id3 integer); Cluster created. Elapsed: 00:00:00.01 16:28:37 TRACKER@oracle> create index k_t_cluster on cluster t_cluster; Index created. Elapsed: 00:00:00.01 16:28:37 TRACKER@oracle> create table t1 (id1 integer, partial_id2 integer, id2 integer, id3 integer, dummy char (100)) cluster t_cluster (id1, partial_id2, id3); Table created. Elapsed: 00:00:00.01 16:28:38 TRACKER@oracle> alter table t1 add constraint pk_t1 primary key (id1, id2, id3); Table altered. Elapsed: 00:00:00.04 16:28:38 TRACKER@oracle> create index nu_t1 on t1 (id1, id3); Index created. Elapsed: 00:00:00.03 16:28:38 TRACKER@oracle> create table t2 (id1 integer, partial_id2 integer, id2 integer, id3 integer, dummy char (100), constraint pk_t2 primary key (id1, partial_id2, id3, id2)) organization index; Table created. Elapsed: 00:00:00.04 16:28:38 TRACKER@oracle> create unique index u_t2 on t2 (id1, id2, id3); Index created. Elapsed: 00:00:00.04 16:28:38 TRACKER@oracle> create index nu_t2 on t2 (id1, id3); Index created. Elapsed: 00:00:00.03 16:28:38 TRACKER@oracle> create table t3 (id1 integer, id2 integer, id3 integer, dummy char (100), constraint pk_t3 primary key (id1, id3, id2)) organization index; Table created. Elapsed: 00:00:00.04 16:28:38 TRACKER@oracle> create index nu_t3 on t3 (id1, id3); Index created. Elapsed: 00:00:00.03 16:28:38 TRACKER@oracle> -- 16:28:38 TRACKER@oracle> begin 16:28:38 2 for j in 1..10 loop 16:28:38 3 for i in 1..50 loop 16:28:38 4 insert into t1 (id1, partial_id2, id2, id3, dummy) 16:28:38 5 select j, round (rownum / 50), rownum, i, 'A' 16:28:38 6 from dba_objects 16:28:38 7 where rownum <= 100; 16:28:38 8 -- 16:28:38 9 insert into t2 (id1, partial_id2, id2, id3, dummy) 16:28:38 10 select j, round (rownum / 50), rownum, i, 'A' 16:28:38 11 from dba_objects 16:28:38 12 where rownum <= 100; 16:28:38 13 -- 16:28:38 14 insert into t3 (id1, id2, id3, dummy) 16:28:38 15 select j, rownum, i, 'A' 16:28:38 16 from dba_objects 16:28:38 17 where rownum <= 100; 16:28:38 18 end loop; 16:28:38 19 end loop; 16:28:38 20 end; 16:28:38 21 / PL/SQL procedure successfully completed. Elapsed: 00:01:33.71 16:30:12 TRACKER@oracle> -- 16:30:12 TRACKER@oracle> commit; Commit complete. Elapsed: 00:00:00.03 16:30:12 TRACKER@oracle> -- 16:30:12 TRACKER@oracle> create table tx as select distinct id1, partial_id2 from t1; create table tx as select distinct id1, partial_id2 from t1 * ERROR at line 1: ORA-00955: name is already used by an existing object Elapsed: 00:00:00.01 16:30:12 TRACKER@oracle> -- 16:30:12 TRACKER@oracle> analyze table t1 compute statistics; Table analyzed. Elapsed: 00:00:02.57 16:30:15 TRACKER@oracle> analyze table t2 compute statistics; Table analyzed. Elapsed: 00:00:01.53 16:30:16 TRACKER@oracle> analyze table t3 compute statistics; Table analyzed. Elapsed: 00:00:01.50 16:30:18 TRACKER@oracle> analyze table tx compute statistics; Table analyzed. Elapsed: 00:00:00.03 16:30:18 TRACKER@oracle> -- 16:30:18 TRACKER@oracle> select sum (blocks) as blocks from (select blocks from dba_tables where table_name = 'T1' union all select leaf_blocks from dba_indexes where table_name = 'T1'); -- BLOCKS ---------- 2591 Elapsed: 00:00:00.73 16:30:18 TRACKER@oracle> select sum (leaf_blocks) as blocks from dba_indexes where table_name = 'T2'; -- BLOCKS ---------- 2764 Elapsed: 00:00:00.04 16:30:18 TRACKER@oracle> select sum (leaf_blocks) as blocks from dba_indexes where table_name = 'T3'; -- BLOCKS ---------- 1864 Elapsed: 00:00:00.03 16:30:19 TRACKER@oracle> set autotrace trace explain statistics; 16:30:19 TRACKER@oracle> -- 16:30:19 TRACKER@oracle> select min (dummy) from (select * from t1 where id1 = 5 and id2 = 50 union select * from t1 natural join tx where id1 = 5 and id3 = 25); Elapsed: 00:00:00.18 Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=57 Card=1 Bytes=102) 1 0 SORT (AGGREGATE) 2 1 VIEW (Cost=57 Card=150 Bytes=15300) 3 2 SORT (UNIQUE) (Cost=57 Card=150 Bytes=16600) 4 3 UNION-ALL 5 4 TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=14 Car d=50 Bytes=5400) 6 5 INDEX (RANGE SCAN) OF 'PK_T1' (UNIQUE) (Cost=3 C ard=50) 7 4 NESTED LOOPS (Cost=4 Card=100 Bytes=11200) 8 7 TABLE ACCESS (FULL) OF 'TX' (Cost=3 Card=3 Bytes =12) 9 7 TABLE ACCESS (CLUSTER) OF 'T1' (Cost=1 Card=33 B ytes=3564) 10 9 INDEX (UNIQUE SCAN) OF 'K_T_CLUSTER' (NON-UNIQ UE) Statistics ---------------------------------------------------------- 29 recursive calls 0 db block gets 70 consistent gets 0 physical reads 0 redo size 479 bytes sent via SQL*Net to client 372 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 1 rows processed 16:30:19 TRACKER@oracle> select min (dummy) from (select * from t2 where id1 = 5 and id2 = 50 union select * from t2 natural join tx where id1 = 5 and id3 = 25); Elapsed: 00:00:00.18 Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=45 Card=1 Bytes=102) 1 0 SORT (AGGREGATE) 2 1 VIEW (Cost=45 Card=150 Bytes=15300) 3 2 SORT (UNIQUE) (Cost=45 Card=150 Bytes=16600) 4 3 UNION-ALL 5 4 INDEX (UNIQUE SCAN) OF 'PK_T2' (UNIQUE) (Cost=1 Ca rd=50 Bytes=5400) 6 5 INDEX (RANGE SCAN) OF 'U_T2' (UNIQUE) (Cost=1 Ca rd=50) 7 4 NESTED LOOPS (Cost=4 Card=100 Bytes=11200) 8 7 TABLE ACCESS (FULL) OF 'TX' (Cost=3 Card=3 Bytes =12) 9 7 INDEX (RANGE SCAN) OF 'PK_T2' (UNIQUE) (Cost=1 C ard=33 Bytes=3564) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 168 consistent gets 0 physical reads 0 redo size 479 bytes sent via SQL*Net to client 372 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 1 rows processed 16:30:19 TRACKER@oracle> select min (dummy) from (select * from t3 where id1 = 5 and id2 = 50 union select * from t3 where id1 = 5 and id3 = 25); Elapsed: 00:00:00.25 Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=46 Card=1 Bytes=102) 1 0 SORT (AGGREGATE) 2 1 VIEW (Cost=46 Card=150 Bytes=15300) 3 2 SORT (UNIQUE) (Cost=46 Card=150 Bytes=15900) 4 3 UNION-ALL 5 4 INDEX (UNIQUE SCAN) OF 'PK_T3' (UNIQUE) (Cost=19 C ard=50 Bytes=5300) 6 5 INDEX (RANGE SCAN) OF 'NU_T3' (NON-UNIQUE) (Cost =19 Card=5000) 7 4 INDEX (UNIQUE SCAN) OF 'PK_T3' (UNIQUE) (Cost=1 Ca rd=100 Bytes=10600) 8 7 INDEX (RANGE SCAN) OF 'NU_T3' (NON-UNIQUE) (Cost =1 Card=100) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 15342 consistent gets 0 physical reads 0 redo size 479 bytes sent via SQL*Net to client 372 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 1 rows processed Followup December 16, 2005 - 1pm Central time zone: ... If it proves to be the case would you then suggest considering splitting the cache? ... no, not without more information. I would first be looking at decreasing the IO's needed on the other thing. so you are using a cluster and have it clustered by dimension1?
a
Bookmark | Bottom | Top
Reviewer:
Jack Douglas from Maidenhead, UK
Sorry Tom, Table tx create failed in my script above because I ran it twice and forgot to drop it. If you run it, it should work though... Perhaps I should also have mentioned that I agree with your logic about IOTs in general and that the PL/SQL part of the script simulates the way data is added to our particular database (1 large dataload per week, dimension3/id3 is a week number). I am not trying to generalise about IOTs versus heaps/clusters. Followup December 16, 2005 - 1pm Central time zone: no worries - the mentioning of the IOT was to achieve clustering - which maybe you are already doing but via a cluster right?
a
Bookmark | Bottom | Top
Reviewer:
Jack Douglas from Maidenhead, UK
Yes we use the cluster to the same effect as an IOT - just to keep certain rows together that we query together. We cluster on a function of the dimensions rather than on any one of them (actually id1, round (id2 / 50), id3) because we query both id1, id3 and id1, id2 and this is a kind of halfway house between the two. Clustering on either one would be great for one side of the union in the query but disastrous for the other - hence the relatively huge IO on the third query. But basically, yes, as you say, we do it for the same reason we would use an IOT. Unfortunately what you gain with an IOT is lost when you add secondary indexes in our case due to the primary key duplication.
a
Bookmark | Bottom | Top
Reviewer:
David Aldridge http://oraclesponge.blogspot.com from Colorado Springs
>> this reminds me sort of "separate indexes from tables" in a way ;) Might be more reasonable to place "large segments you plan on multi-block IO'ing together and separate them from segments you plan o single-block IO'ing" at the volume level in order to prevent read ahead from kicking in. Reason I say this is - it was observed that separating indexes from data helped a system once. turned out it was the separation of single block IO'ed objects from multi-block IO'ed objects - and it was the suppression of "magic read ahead algorithms" on the single block IO'ed things that made the difference. << Ah, that's an interesting point on the index/table setup ... yes, if there is a way to segregate the segments according to whether we want read ahead applied or not then that would do the trick. Much depends on how read ahead works on a particular system though I'd think.
a
Bookmark | Bottom | Top
Reviewer:
Yoav
Hi Tom, In 8.1.7.4 when defining multiple buffer pool its needed to specify two attributes for each buffer pool: * The number of buffers in the buffer pool * The number of lru latches allocated to the buffer pool for example : BUFFER_POOL_KEEP=(BUFFERS:10000, LRU_LATCHES:2) Could you please explain how to calculate the size of "BUFFERS" and "LRE_LATCHES" ? Regards. Followup May 4, 2006 - 1am Central time zone: are you really realy really sure you need to use this (I mean, if you don't know how big it is - that should be something you "know" since you are going to use this to tune a very specific problem with?)
a
Bookmark | Bottom | Top
Reviewer:
Yoav
Hi Tom, Thanks for your input. We have 3rd party application,and we are suffering from heavy logical i/o activity. (e.g:db file sequential read show 80% in thae last statspack on a unix with 8 cpus). For now we cant touch the application code . So i read in expert-one-on-one page 80 : "...a buffer pool ,large enough to our 'lookup' tables in memory. for example, when oracle read a blocks from this table,they always get cached in this special pool. ... A buffer that is set up to cache blocks like this is known as KEEP pool..." And also read : http://asktom.oracle.com/pls/ask/f?p=4950:8:13335700705400115196::NO::F4950_P8_DISPLAYID,F4950_P8_CR ITERIA:6265095774206 and thought that increasing the buffer cache and using multiple pools ,m a y b e, help to reduce this the logical i/o. 1. So back to my last question, i think that "BUFFER:" represent the SUM(BYTE) from dba_tables + SUM(BYTE) from dba_indexes . Is it right ? 2. I actualy dont know how much "LRE_LATCHES:" to assign for the KEEP pool and for the RECYCLE pool. I hope you can advice about that. Regards. Followup May 5, 2006 - 1am Central time zone: .... and we are suffering from heavy logical i/o activity. (e.g:db file sequential read show 80% in thae last statspack on a unix with 8 cpus). ......... that would not be heavy logical i/o perhaps - that would be PHYSICAL IO. and 80% - that is just a ratio, 80% of WHAT? I'm not going to touch the buffer pool here - you haven't identified as far as I can tell what the problem is, what objects you might slide into these pools and how that might help you.
a
Bookmark | Bottom | Top
Reviewer:
Yoav
Hi Tom, I based my estimation on the statspack. since this thread is not about the statspack i didnt want to add any copy/paste proof from the statspack. Hir some more information: 1. I wrote heavy logical i/o based on the logical reads per second (2054 per second) which is about 7 time more then the physical reads (300 per second) 2. From the "Top 5 Wait Events" 80% of the total waits event was due to "db file sequential read" (I know that you want like it, but for fairness I'll say that the statspack run for about 40 hours - but in this case its just help me to show that i have a consistent problem with "db file sequential read" wait event). 3. The best thing to do, was to change the appliaction code. I know that touching the buffer cache want change the fact that 1 of 4 transaction still will ended with rollback. But ss i said Since its not possible ,at this point in time to change the appliaction code, dont you think that increasing the buffer cache and using multiple pools can help ? Cache Sizes ~~~~~~~~~~~ db_block_buffers: 524288 log_buffer:262144 db_block_size: 4096 shared_pool_size:89128960 Load Profile ~~~~~~~~~~~~ Per Second Per Transaction --------------- --------------- Redo size: 129,876.68 29,747.96 Logical reads: 2,054.39 470.55 Block changes: 949.55 217.49 Physical reads: 299.29 68.55 Physical writes: 72.85 16.69 User calls: 329.95 75.58 Parses: 37.97 8.70 Hard parses: 0.10 0.02 Sorts: 9,597.31 2,198.24 Logons: 0.07 0.02 Executes: 195.04 44.67 Transactions: 4.37 % Blocks changed per Read: 46.22 Recursive Call %: 16.25 Rollback per transaction %: 24.00 Rows per Sort: 0.03 Top 5 Wait Events ~~~~~~~~~~~~~~~~~ Wait % Total Event Waits Time (cs) Wt Time -------------------------------------------- ------------ ------------ ------- db file sequential read 15,204,906 6,856,997 79.03 log file parallel write 1,085,325 1,070,571 12.34 ------------------------------------------------------------- Thanks. Followup May 5, 2006 - 6am Central time zone: so, how did that tell you what segments would benefit from their own buffer pool. That is my point, how the heck do you know what you want to put into this buffer pool. 40 hours, bah - ignore this statspack, you cannot do anything with it. multiple buffer pools ain't going to help - unless and until you know the segments that would truly benefit from it - and I don't think you do.
a
Bookmark | Bottom | Top
Reviewer:
Guy Lambregts from Belgium
I think to know what alter table t1 cache did - in release 7.x and 8.x - this with repect to MRU and LRU if the blocks of table t1 where accessed in a scattered way (FTS). Also I think to know from release 8.1.5 onwards Oracle has introduced the touch count algorithm. With my own words : A block is read into the buffer cache where it is inserted at a midpoint, somwhere at the "border" of the cold and the warm region. The more a block is touched the more its touch count increases (x$bh.tch) the lesser the likelyhood the block will be "thrown" out of the cache. I know there are a set of hidden parameters, which I never ... touch. Now my question : Is the alter table t1 cache still relevant with respect to this touch count philosophy ? I mean is there still any difference between alter table t1 cache and alter table t1 nocache ? Kind Regards Guy Followup March 17, 2007 - 3pm Central time zone: you should use alter table T cache since the introduction of the keep and recycle pools really.but the effect is the same, a table with the cache attribute is considered a short table during a full scan regardless of its real size.
a
Bookmark | Bottom | Top
Reviewer:
A reader
[1] Does it worth to keep table a in keep pool. [2] e.g for a table which is not in keep pool, any dml updates data buffer cache and then write the data on disk at checkpoint.What happens when we keep a table in keep pool. How does inserts, deletes and updates work on this table a. Followup March 26, 2007 - 7am Central time zone: 1) yes, no, maybe.2) just like they would if you did not use a keep pool - the keep/recycle pools are just like the buffer cache itself, they are buffer caches.
a
Bookmark | Bottom | Top
Reviewer:
Sal from NJ
OLTP .. 10.2 version. We have identified 4 tables that are the center of all activities in the database. Update, insert, delete, selects.. Some DBA's are suggesting to put them in keep pool. They tested it on a production copy and the bottlenck operations started working in seconds instead of 5+ minutes. But in their test they were the only one in the database.. The tables are 3 GB in size.. Our shared pool is 8 gb, they are saying lets increase it to 16 gb and put these tables worth 3 gb on keep pool. Is there any limit to what can be cached? I have cached smaller lookup tables in past, but not sure about this here. Any suggestions Sir? thanks Followup February 21, 2008 - 5pm Central time zone: only the limits of machine addressability.
a
Bookmark | Bottom | Top
Reviewer:
Gareth Adcock from England
Hello I think that I may have a candidate for the 'alter table cache' command. The database concerned (version 9.2.0.3.0) has a buffer cache of 2400 Mb. I have a table which I think 'should' be cached most of the time but would appear to have been aged out most of the times that the application has read from it, resulting in a large ammount of physical reads. The table I am looking at the most is at the bottom of the following list in terms of size (F554215). select segment_name,segment_type, ((blocks*8)/1024)/1024 GB from dba_segments where (owner,segment_name) in (select owner,object_name from v$segment_statistics where statistic_name = 'physical reads' and value > 100000000) order by blocks desc SEGMENT_NAME SEGMENT_TYPE GB ------------------------------------------------------------- F42199 TABLE 15.902557373046875 F42199 TABLE 13.0977630615234375 F4111 TABLE 10.4477691650390625 F0911 TABLE 10.242462158203125 F42119 TABLE 1.8017578125 F0902 TABLE .7207183837890625 F4611 TABLE .4254913330078125 F4108 TABLE .34760284423828125 F4211 TABLE .11937713623046875 F554215 TABLE .0668182373046875 It is a 'hot' table rather than a 'keep warm' table. It gets read a lot; the potential problem is that it gets read a lot form disk. select owner,object_name,value from v$segment_statistics where statistic_name = 'physical reads' and value > 100000000 (hundred million) order by value desc OWNER OBJECT_NAME VALUE ---------------------------------------------- PRODDTA F0911 2480884853 PRODDTA F4111 1733224651 PRODDTA F554215 1549742899 PRODDTA F42119 982973397 PRODDTA F42199 908307993 PRODDTA F0902 296390653 CRPDTA F42199 187409137 PRODDTA F4108 181621627 PRODDTA F4211 148565012 PRODDTA F4611 118151835 So a relatively small table comes third in total physical reads. select owner,object_name,value from v$segment_statistics where statistic_name = 'logical reads' and value > 1000000000 (thousand million) order by value desc OWNER OBJECT_NAME VALUE ---------------------------------------------------- PRODDTA F0911 4087505552 SYS I_OBJ1 2717363136 PRODDTA F4111 1909729328 PRODDTA F554215 1840557360 PRODDTA F4108_0 1626329632 PRODDTA F554108_7 1328939200 PRODDTA F41021_PK 1310094416 PRODDTA F59FS800 1146349296 PRODDTA CI_F0911_FIN_CUBE 1006994896 So about 84% of the time the data gets read from disk. I looked at the sql statements pertaining to this table,which were currently stored in memory. SQL Analyzer showed up the likely suspect and I queried v$sql select fetches, executions, first_load_time,last_load_time from v$sql where sql_text = 'SELECT * FROM PRODDTA.F554215 WHERE ( XHMCU = :KEY1 ) ' FETCHES EXECUTIONS FIRST_LOAD_TIME LAST_LOAD_TIME ---------------------------------------------------------------------------------------------------- ------------ 5587 5587 2008-01-14/06:14:02 2008-02-25/06:28:48 SELECT count(*) FROM PRODDTA.F554215 group by xhmcu COUNT(*) -------------- 175002 2 1 I'm guessing it's always going to do a full table scan. This table gets updated a lot and I understand that this makes it unsuitable for a bitmap index, besides it's going to want to bring back almost all of the blocks almost all of the time. A function based index may be an option for the rare values but these are almost never queried. I ran the sql statement from my SQL Developer session. Explian Plan: OPERATION OPTIMIZER COST CARDINALITY BYTES SELECT STATEMENT CHOOSE 1907 174200 61144200 TABLE ACCESS(FULL) PRODDTA.F554215 ANALYZED 1907 174200 61144200 Autotrace: recursive calls 0 db block gets 0 consistent gets 8856 physical reads 8730 redo size 0 bytes sent via SQL*Net to client 1613 bytes received via SQL*Net from client 647 SQL*Net roundtrips to/from client 5 sorts (memory) 2 sorts (disk) 0 In my novice opinion it would seem that this table could benefit from 'alter table cache' to slow down the aging out process. If this is not a practical or a good idea would setting up a seperate keep buffer cache be worth looking into? My first instinct is to trust the algorithms that Oracle uses to allocate resources but this one is puzzling me. Thanks Gareth Followup February 28, 2008 - 11pm Central time zone: SELECT count(*) FROM PRODDTA.F554215group by xhmcu I would erase that query in the application, that would be my fix. I'll bet it is some silly logic like: select the count if the count > 0 then do something I would just code: do something period, end, you never need to count the rows. Besides, look at that query, the output is USELESS SELECT count(*) FROM PRODDTA.F554215 group by xhmcu a count grouped by a column that is not selected, so you get a bunch of random numbers in some arbitrary order representing something. what a waste.
a
Bookmark | Bottom | Top
Reviewer:
Gareth from England
The actual query I am looking at is SELECT * FROM PRODDTA.F554215 WHERE ( XHMCU = :KEY1 ) Sorry, it's tucked away in: select fetches, executions, first_load_time,last_load_time from v$sql where sql_text = 'SELECT * FROM PRODDTA.F554215 WHERE ( XHMCU = :KEY1 ) ' I should have referenced it explicitly, my fault. The count(*) was just to demonstrate the limited number of possible values in xhmcu and skewed nature of the data. apologies Gareth Followup March 1, 2008 - 10am Central time zone: ok, now for a super silly question.why do you need to full scan this over and over (I'm going to keep going back to the application, where 99.999% of all tuning needs be done in every case). Why would you full scan this table *so often*, what is the logic there. Does it really make sense to do this (putting in ram isn't going to fix very much, if you are using regular OS files, it is already probably buffered in the OS file system cache). don't forget - the v$ tables are since *forever* (since instance started, those IO's are cumulative) Now, that said - looking at the dates on that - that is like two weeks worth of executes. Why do you need to full scan this table 400 times a day?
a
Bookmark | Bottom | Top
Reviewer:
Gareth from England
Thanks for looking at this. The short answer is that I don't know. This is third party application that bolts on to our main JDEdwards database. I don't have any direct involvement in development other than to raise concerns after the event. I am currently highlighting issues concerning the use of literals rather than bind variables in the code resulting in hundreds of almost identical sql statements in the shared pool. I will add this to my list of issues. Thanks again for taking the time to look at this. Gareth Followup March 3, 2008 - 7am Central time zone: you could try setting up a keep pool (a couple times larger than the size of the table, especially if it is modified/updated)...alter the table to be in the keep pool alter the table 'cache' that should reduce the physical IO to minimal - and if that is the cause of a problem (the IO's were), that problem would be reduced...
a
Bookmark | Bottom | Top
Reviewer:
Ramani from USA
Hi tom Thanks for the information regarding keeping / deleting tables in buffer_pool. alter table mytable storage (buffer_pool keep); alter table mytable storage (buffer_pool default); Is there a way to check what tables are kept in buffer_pool ? from which dictionary view. I do not think I get the table names that are kept in buffer_pool from V$BUFFER_POOL. Thanks in advance, Ramani
a
Bookmark | Bottom | Top
Reviewer:
A reader
SQL> SQL> SELECT * FROM v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production PL/SQL Release 11.1.0.6.0 - Production CORE 11.1.0.6.0 Production TNS for 32-bit Windows: Version 11.1.0.6.0 - Production NLSRTL Version 11.1.0.6.0 - Production Elapsed: 00:00:00.01 SQL> SQL> show parameter sga NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ lock_sga boolean FALSE pre_page_sga boolean FALSE sga_max_size big integer 1104M sga_target big integer 1104M SQL> show parameter pga NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ pga_aggregate_target big integer 400M SQL> SQL> CREATE TABLE m AS 2 SELECT rownum mID, 3 MOD(rownum, 100) cnt1, 4 MOD(rownum, 200) cnt2, 5 TRUNC(dbms_random.value(1, 10000)) cnt3, 6 RPAD('x', 30, 'x') char1, 7 RPAD('x', 30, 'y') char2, 8 RPAD('x', 300, 'z') char3 9 FROM dual 10 CONNECT BY level <= 1000000; Table created. Elapsed: 00:00:16.75 SQL> SQL> CREATE TABLE i AS 2 SELECT rownum iID, 3 mID, 4 MOD(rownum, 2) cnt1, 5 MOD(rownum, 4) cnt2, 6 RPAD('x', 30, 'x') char1 7 FROM m 8 UNION ALL 9 SELECT rownum + 1000001 iID, 10 mID, 11 MOD(rownum, 2) cnt1, 12 MOD(rownum, 4) cnt2, 13 RPAD('x', 30, 'x') char1 14 FROM m; Table created. Elapsed: 00:00:17.15 SQL> SQL> SELECT segment_Name, bytes/1024/1024 MB 2 FROM user_segments 3 WHERE segment_Name IN ('I', 'M'); SEGMENT_NAME MB ------------------------------ ---------- I 119 M 439 Elapsed: 00:00:00.01 SQL> SQL> SQL> ALTER TABLE m ADD CONSTRAINT m_pk PRIMARY KEY (mID); Table altered. Elapsed: 00:00:08.54 SQL> SQL> ALTER TABLE i ADD CONSTRAINT i_fk1 FOREIGN KEY (mID) 2 REFERENCES m(mID); Table altered. Elapsed: 00:00:04.82 SQL> SQL> CREATE INDEX i_fk1 ON i(mID); Index created. Elapsed: 00:00:02.54 SQL> SQL> SQL> BEGIN 2 DBMS_STATS.GATHER_TABLE_STATS( 3 ownname => user, 4 tabname => 'M', 5 estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE, 6 method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY', 7 cascade => TRUE); 8 9 DBMS_STATS.GATHER_TABLE_STATS( 10 ownname => user, 11 tabname => 'I', 12 estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE, 13 method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY', 14 cascade => TRUE); 15 END; 16 / PL/SQL procedure successfully completed. Elapsed: 00:00:21.51 SQL> SQL> SQL> set autotrace on SQL> SQL> SELECT COUNT(*) 2 FROM m, i 3 WHERE m.mID = i.mID AND 4 i.cnt1 = 1 AND 5 m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxx Why are there so many physical IO's even after I marked the tables for caching in memory? If these were the only two tables in the database, is there anyway I can completely eliminate physical IO's giving that 1 GB is set aside for SGA and the 2 tables added up to be around 550 MB? Followup July 14, 2009 - 1pm Central time zone: well, you say the SGA is 1gb, but the SGA is made of many things. How big is your BUFFER Cache and remember - lots of other stuff has to fit in there.and what was the physical IO to - tkprof with a row source operation would be infinitely better than this - this has no detail.
a
Bookmark | Bottom | Top
Reviewer:
A reader
Reader, Since you're on 11g, you may want to look into using the result cache feature. Make sure the appropriate result cache init parameters are sized for the amount of data you want to cache. Rerun your example and add the hint on your selects. The first select will still do all the IOs, subsequent executions of the same select should be extremely fast because the result is cached. Give it a try, it rocks!
a
Bookmark | Bottom | Top
Reviewer:
A reader
Here's the tkprof with row source operation. It looks like the physical IO's are all from full scanning table m. STAT #3 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=69914 pr=55556 pw=55556 time=0 us)' STAT #3 id=2 cnt=1000000 pid=1 pos=1 obj=0 op='HASH JOIN (cr=69914 pr=55556 pw=55556 time=266516 us cost=18706 size=44818048 card=1018592)' STAT #3 id=3 cnt=1000000 pid=2 pos=1 obj=22718 op='TABLE ACCESS FULL I (cr=14350 pr=0 pw=0 time=3888 us cost=3199 size=8148736 card=1018592)' STAT #3 id=4 cnt=1000000 pid=2 pos=2 obj=22717 op='TABLE ACCESS FULL M (cr=55564 pr=55556 pw=55556 time=245890 us cost=12257 size=35996760 card=999910)' SQL> show parameter db_cache_size NAME TYPE VALUE ------------------------------------ ----------- ----- db_cache_size big integer 712M SQL> show parameter sga NAME TYPE VALUE ------------------------------------ ----------- ----- lock_sga boolean FALSE pre_page_sga boolean FALSE sga_max_size big integer 1104M sga_target big integer 1104M I tried setting db_cache_size to 0 so Oracle would automatically allocate memory. However, the result was the same. Followup July 15, 2009 - 11am Central time zone: I see that pr=55556I see also that pw (physical write) is...... 55556. How about you do the trace with wait events being recorded - I'll bet the waits are "direct read temp" and "direct write temp" Eg: table M is hashed but spills to disk.
a
Bookmark | Bottom | Top
Reviewer:
A reader
Thanks for the tip. However, that's not the point of the question. I was trying to get a better understanding of the behavior of caching a table. Followup July 15, 2009 - 11am Central time zone: I think the table is probably cached, but you are spilling to disk on a hash operation. Given that pr = pw anyway.
a
Bookmark | Bottom | Top
Reviewer:
A reader
All waits are direct path reads. WAIT #3: nam='direct path read' ela= 2 file number=8 first dba=2623184 block cnt=16 obj#=22717 tim=5157349182372 WAIT #3: nam='direct path read' ela= 4 file number=8 first dba=2623200 block cnt=16 obj#=22717 tim=5157349183354 WAIT #3: nam='direct path read' ela= 2 file number=8 first dba=2623216 block cnt=16 obj#=22717 tim=5157349184092 ... ... ... WAIT #3: nam='direct path read' ela= 2 file number=8 first dba=2562560 block cnt=16 obj#=22717 tim=5157351288487 WAIT #3: nam='direct path read' ela= 4 file number=8 first dba=2562576 block cnt=16 obj#=22717 tim=5157351289457 WAIT #3: nam='direct path read' ela= 2 file number=8 first dba=2562592 block cnt=16 obj#=22717 tim=5157351290425 ******************************************************************************** SELECT COUNT(*) FROM m, i WHERE m.mID = i.mID AND i.cnt1 = 1 AND m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxx Followup July 15, 2009 - 12pm Central time zone: oh, that explains it, it isn't using a conventional path read at all - it is bypassing the buffer cache. So the alter cache is not useful at all in this case.In 11g - we read via direct path over conventional path (buffer cache) based on statistics and setup. Direct path reads make sure the latest version of the block is on disk and then just reads away. strange that the direct path read incremented the pw, that isn't right.
a
Bookmark | Bottom | Top
Reviewer:
A reader
Can you elaborate on this statement "In 11g - we read via direct path over conventional path (buffer cache) based on statistics and setup.", specifically the setup part. Also why would it do direct path reads on one table but not the other? Followup July 15, 2009 - 2pm Central time zone: the table sizes are different. Based on your object and system statistics, the optimizer opted for a direct path read.Normally, a full scan in SERIAL mode would use a conventional path read. It would use your db file multiblock read count (should be set automatically by us in 10g and above to do the MAX multiblock read available on your system) to figure out how many blocks to read at a time. Sounds all good - all efficient - however.... Say we decided to read 32 blocks at a time. Further assume block 1, 10, 15, 20, 22 are in the cache already. We cannot use the image on disk for these guys so the IO looks like: logical IO block 1. multiblock read 2-9 into the cache, logical IO them out. logical IO block 10 multiblock read 11-14 into the cache, logical IO them out. logical IO block 15 multiblock read 16-19 into the cache, logical IO them out. logical IO block 20 multiblock (well, single block really) read 21 into the cache, logical IO it out logical IO block 22 multiblock read 23-32 into the cache, logical IO them out. Using a direct read it would be a) before running query, checkpoint the segment ensuring current version of data is on disk. b) read blocks 1-32 into PGA and process them. The size of the table (your one is bigger than the other) the multiblock read count (actual observed on the system - not the parameter) will influence whether we direct path or conventional path read the segment.
a
Bookmark | Bottom | Top
Reviewer:
A reader
I ran the exact same test case on 10.2.0.4. Now it looks like neither tables are being cached?! SQL> show parameter sga NAME TYPE VALUE ------------------------------------ ----------- ----- lock_sga boolean FALSE pre_page_sga boolean FALSE sga_max_size big integer 1104M sga_target big integer 1104M SQL> show parameter db_cache_size NAME TYPE VALUE ------------------------------------ ----------- ----- db_cache_size big integer 0 Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production With the Partitioning, OLAP, Data Mining and Real Application Testing options PARSING IN CURSOR #48 len=127 dep=0 uid=47 oct=3 lid=47 tim=3302881257 hv=1790720527 ad='4c634a74' SELECT COUNT(*) FROM m, i WHERE m.mID = i.mID AND i.cnt1 = 1 AND m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxx Followup July 15, 2009 - 3pm Central time zone: probably the cache is just insufficient. By the time we read the end of the table we are finding we have to overwrite the "head" of the table. So when you go for the second run, the head of the table isn't there anymore, we read that in and overwrite more of the table - which we have to read and so on.Not sure why this is surprising? You have well over 1/2gb of data, your SGA is just 1gb. There are many other 'pools' in there. and unless you are the sole user on this system - there is a ton of other stuff going on.
a
Bookmark | Bottom | Top
Reviewer:
A reader
SQL> SELECT buffer_pool, COUNT(*) 2 FROM dba_tables 3 GROUP BY buffer_pool; BUFFER_ COUNT(*) ------- ---------- 153 DEFAULT 1655 SQL> show parameter pool NAME TYPE VALUE ------------------------------------ ----------- -------- buffer_pool_keep string buffer_pool_recycle string global_context_pool_size string java_pool_size big integer 0 large_pool_size big integer 0 olap_page_pool_size big integer 0 shared_pool_reserved_size big integer 30198988 shared_pool_size big integer 0 streams_pool_size big integer 0 Followup July 15, 2009 - 4pm Central time zone: SQL> show sgalet us see how much is currently allocated to the buffer pool.
a
Bookmark | Bottom | Top
Reviewer:
A reader
SQL> show sga Total System Global Area 1157627904 bytes Fixed Size 1298016 bytes Variable Size 218104224 bytes Database Buffers 931135488 bytes Redo Buffers 7090176 bytes Followup July 16, 2009 - 11am Central time zone: let me see the output of:ops$tkyte%ORA10GR2> select name || '=' || value from v$parameter where isdefault = 'FALSE'; NAME||'='||VALUE ------------------------------------------------------------------------------- processes=150 sessions=300 sga_max_size=1157627904 sga_target=1157627904 control_files=/home/ora10gr2/oradata/ora10gr2/control01.ctl, /home/ora10gr2/ora data/ora10gr2/control02.ctl, /home/ora10gr2/oradata/ora10gr2/control03.ctl db_block_size=8192 compatible=10.2.0.1.0 db_create_file_dest=/home/ora10gr2/oradata/ora10gr2 db_recovery_file_dest=/home/ora10gr2/oradata/fbra db_recovery_file_dest_size=10737418240 undo_management=AUTO undo_tablespace=UNDOTBS undo_retention=5000 db_domain= dispatchers=(protocol=tcp) job_queue_processes=10 db_name=ora10gr2 open_cursors=300 os_authent_prefix=OPS$ pga_aggregate_target=419430400 20 rows selected. and I'll see if I can reproduce - as it is not, I cannot. It would seem you have sufficient buffer cache.
a
Bookmark | Bottom | Top
Reviewer:
A reader
SQL> select name || '=' || value from v$parameter where isdefault = 'FALSE' 2 order by name 3 / NAME||'='||VALUE ---------------------------------------------------------------------------------------------------- ------------------------------- aq_tm_processes=0 background_dump_dest=C:\ORACLE\PRODUCT\10.2.0\ADMIN\TEST_DB\BDUMP compatible=10.2.0.4.0 control_files=C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST_DB\CONTROL01.CTL, C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST_DB\CONTROL02.CTL, C:\OR ACLE\PRODUCT\10.2.0\ORADATA\TEST_DB\CONTROL03.CTL core_dump_dest=C:\ORACLE\PRODUCT\10.2.0\ADMIN\TEST_DB\CDUMP db_block_size=8192 db_cache_size=209715200 db_create_file_dest=C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST_DB db_domain= db_file_multiblock_read_count=16 db_name=TEST_DB disk_asynch_io=FALSE fast_start_mttr_target=0 java_pool_size=0 job_queue_processes=200 large_pool_size=0 nls_length_semantics=BYTE open_cursors=300 optimizer_index_caching=90 optimizer_index_cost_adj=25 pga_aggregate_target=419430400 processes=1000 query_rewrite_enabled=FALSE remote_login_passwordfile=EXCLUSIVE resource_manager_plan= session_max_open_files=20 sessions=1500 sga_max_size=1157627904 sga_target=1157627904 shared_pool_size=0 sort_area_size=524288 star_transformation_enabled=FALSE streams_pool_size=0 timed_statistics=TRUE undo_management=AUTO undo_retention=900 undo_tablespace=UNDOTBS1 user_dump_dest=C:\ORACLE\PRODUCT\10.2.0\ADMIN\TEST_DB\UDUMP 38 rows selected. Followup July 16, 2009 - 4pm Central time zone: well, I disagree with the setting of most of your parameters - however on linux, I cannot reproduce your apparent findings. Now, that said, you differ from me in that the automatic SGA resizing has kicked in on your system - the buffer cache was made larger on your system.That said, maybe someone else with a windows play system can test this out for us ops$tkyte%ORA10GR2> set linesize 1000 ops$tkyte%ORA10GR2> ops$tkyte%ORA10GR2> drop table m; Table dropped. ops$tkyte%ORA10GR2> drop table i; Table dropped. ops$tkyte%ORA10GR2> ops$tkyte%ORA10GR2> CREATE TABLE m AS 2 SELECT rownum mID, 3 MOD(rownum, 100) cnt1, 4 MOD(rownum, 200) cnt2, 5 TRUNC(dbms_random.value(1, 10000)) cnt3, 6 RPAD('x', 30, 'x') char1, 7 RPAD('x', 30, 'y') char2, 8 RPAD('x', 300, 'z') char3 9 FROM dual 10 CONNECT BY level <= 1000000; Table created. ops$tkyte%ORA10GR2> ops$tkyte%ORA10GR2> CREATE TABLE i AS 2 SELECT rownum iID, 3 mID, 4 MOD(rownum, 2) cnt1, 5 MOD(rownum, 4) cnt2, 6 RPAD('x', 30, 'x') char1 7 FROM m 8 UNION ALL 9 SELECT rownum + 1000001 iID, 10 mID, 11 MOD(rownum, 2) cnt1, 12 MOD(rownum, 4) cnt2, 13 RPAD('x', 30, 'x') char1 14 FROM m; Table created. ops$tkyte%ORA10GR2> SELECT segment_Name, bytes/1024/1024 MB 2 FROM user_segments 3 WHERE segment_Name IN ('I', 'M'); SEGMENT_NAME MB ------------------------------ ---------- I 120 M 438 ops$tkyte%ORA10GR2> ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'M' ); PL/SQL procedure successfully completed. ops$tkyte%ORA10GR2> exec dbms_stats.gather_table_stats( user, 'I' ); PL/SQL procedure successfully completed. ops$tkyte%ORA10GR2> ops$tkyte%ORA10GR2> select name || '=' || value from v$parameter where isdefault = 'FALSE' order by name; NAME||'='||VALUE ---------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------- aq_tm_processes=0 compatible=10.2.0.4.0 control_files=/home/ora10gr2/oradata/ora10gr2/control01.ctl, /home/ora10gr2/oradata/ora10gr2/control02.ctl, /home/ora10gr2/oradata/ora10gr2/control03.ctl db_block_size=8192 db_cache_size=218103808 db_create_file_dest=/home/ora10gr2/oradata/ora10gr2 db_domain= db_file_multiblock_read_count=16 db_name=ora10gr2 db_recovery_file_dest=/home/ora10gr2/oradata/fbra db_recovery_file_dest_size=10737418240 disk_asynch_io=FALSE dispatchers=(protocol=tcp) fast_start_mttr_target=0 java_pool_size=0 job_queue_processes=200 large_pool_size=0 nls_length_semantics=BYTE open_cursors=300 optimizer_index_caching=90 optimizer_index_cost_adj=25 os_authent_prefix=OPS$ pga_aggregate_target=419430400 processes=1000 query_rewrite_enabled=FALSE remote_login_passwordfile=EXCLUSIVE session_max_open_files=20 sessions=1500 sga_max_size=1157627904 sga_target=1157627904 shared_pool_size=0 sort_area_size=524288 star_transformation_enabled=FALSE streams_pool_size=0 timed_statistics=TRUE undo_management=AUTO undo_retention=5000 undo_tablespace=UNDOTBS 38 rows selected. ops$tkyte%ORA10GR2> ops$tkyte%ORA10GR2> alter system flush buffer_cache; System altered. ops$tkyte%ORA10GR2> alter table m cache; Table altered. ops$tkyte%ORA10GR2> alter table i cache; Table altered. ops$tkyte%ORA10GR2> ops$tkyte%ORA10GR2> set autotrace on ops$tkyte%ORA10GR2> SELECT COUNT(*) 2 FROM m, i 3 WHERE m.mID = i.mID AND 4 i.cnt1 = 1 AND m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxx
a
Bookmark | Bottom | Top
Reviewer:
A reader
Can you give me a few examples and the reason? Thanks. Followup July 24, 2009 - 7am Central time zone: because they are set - you shouldn't be setting very many, if any, non-default parameters beyond what MUST be set.disabling features, setting pools to zero
a
Bookmark | Bottom | Top
Reviewer:
Dheeraj from India
Tom, Referring to the cache discussion in ther very first question. I understand that the alter table ...cache, only effects the ageing of the table. My question is,once we do the alter table...cache does it apply to the entire life of the table or is it session specific and goes off once the session is exit Thanks, Dheeraj Followup July 24, 2009 - 7am Central time zone: it is a table attribute that is in place until you change it. It survives database restarts and all.
a
Bookmark | Bottom | Top
Reviewer:
Oleksandr Alesinskyy from Germany
You alter TABLE, not session. this alteration (as was already pointed by Tom) is written into data dictionary - that means that it is permanent and not session-specific. Moreover, this option may be specified by table creation.
a
Bookmark | Bottom | Top
Reviewer:
A reader
Tom, The physical IO's have all disappered if there were no constraints and indexes on the tables. However, they come back as soon as constraints and indexes are introduced in the test. These indexes/constraints were in the originally test case (ALTER TABLE CACHE July 9, 2009 - 1am US/Eastern), but you have removed them from your test case. Followup July 24, 2009 - 8am Central time zone: give me the example, please don't make me attempt to try to reconstruct the example you think I should run...
a
Bookmark | Bottom | Top
Reviewer:
Srinath from Dayton, OH
SQL*Plus: Release 11.1.0.6.0 - Production on Thu Jul 23 11:42:11 2009 Copyright (c) 1982, 2007, Oracle. All rights reserved. Enter user-name: scott/******** Connected to: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL> CREATE TABLE m AS 2 SELECT rownum mID, 3 MOD(rownum, 100) cnt1, 4 MOD(rownum, 200) cnt2, 5 TRUNC(dbms_random.value(1, 10000)) cnt3, 6 RPAD('x', 30, 'x') char1, 7 RPAD('x', 30, 'y') char2, 8 RPAD('x', 300, 'z') char3 9 FROM dual 10 CONNECT BY level <= 1000000; Table created. SQL> SQL> CREATE TABLE i AS 2 SELECT rownum iID, 3 mID, 4 MOD(rownum, 2) cnt1, 5 MOD(rownum, 4) cnt2, 6 RPAD('x', 30, 'x') char1 7 FROM m 8 UNION ALL 9 SELECT rownum + 1000001 iID, 10 mID, 11 MOD(rownum, 2) cnt1, 12 MOD(rownum, 4) cnt2, 13 RPAD('x', 30, 'x') char1 14 FROM m; Table created. SQL> set linesize 1000 SQL> SELECT segment_Name, bytes/1024/1024 MB 2 FROM user_segments 3 WHERE segment_Name IN ('I', 'M'); SEGMENT_NAME MB --------------------------------------------------------------------------------- ---------- I 120 M 440 SQL> exec dbms_stats.gather_table_stats( user, 'M' ); PL/SQL procedure successfully completed. SQL> SQL> exec dbms_stats.gather_table_stats( user, 'I' ); PL/SQL procedure successfully completed. SQL*Plus: Release 11.1.0.6.0 - Production on Thu Jul 23 11:59:06 2009 Copyright (c) 1982, 2007, Oracle. All rights reserved. Enter user-name: scott/tiger Connected to: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL> set line 1000 SQL> select name || '=' || value from v$parameter where isdefault = 'FALSE' orde r 2 by name; NAME||'='||VALUE -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- ---------------------------------------- audit_file_dest=C:\APP\******\ADMIN\ORA11G\ADUMP audit_trail=DB compatible=11.1.0.0.0 control_files=C:\APP\******\ORADATA\ORA11G\CONTROL01.CTL, C:\APP\******\ORADATA\ORA11G \CONTROL02.CTL, C:\APP\******\ORADATA\ORA11G\CONTROL03.CTL db_block_size=8192 db_domain= db_name=ora11g db_recovery_file_dest=C:\app\******\flash_recovery_area db_recovery_file_dest_size=2147483648 diagnostic_dest=C:\APP\****** dispatchers=(PROTOCOL=TCP) (SERVICE=ora11gXDB) NAME||'='||VALUE -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- ---------------------------------------- memory_target=1719664640 open_cursors=300 processes=150 remote_login_passwordfile=EXCLUSIVE undo_tablespace=UNDOTBS1 16 rows selected. SQL> alter system flush buffer_cache; System altered. SQL> alter table m cache; Table altered. SQL> SQL> alter table i cache; Table altered. SQL> set autotrace on SQL> SELECT COUNT(*) 2 FROM m, i 3 WHERE m.mID = i.mID AND 4 i.cnt1 = 1 AND m.char1 = 'xxxxxxxxxxxxxxxxxxxxxxxxCan you please shed some light on these stats in Oracle 11g please ? (autotrace output of the second Sql call) I would not expect any physical reads at all, but I cant believe what I'm seeing. Followup July 26, 2009 - 7am Central time zone: tkprof, undoubtedly it decided for a direct path read, avoid entirely t he huge overhead of the buffer cache.
a
Bookmark | Bottom | Top
Reviewer:
Srinath from Dayton, OH
Tom, Why is there a change in the stats between 11g and 10g versions ? Is it like a new feature in 11g database ? Thanks. Followup July 26, 2009 - 9pm Central time zone: anyway - we went over this a lot above - I'm not sure why this is a surprise. We beat this to death actually. You just ran an example we provided, and that example was provided to show that what you are seeing would actually happen.... But yes, you should always expect version X to work differently in some respects from version Y - else version X would be version Y wouldn't it.
a
Bookmark | Bottom | Top
Reviewer:
A reader
Maybe I can rephrase Srinath's question regarding the change of behavior between 10g and 11g because I also have a similar question. This direct path reads or whatever new data access path that 11g is employing to retrieve data - why should it be a step down in terms of performance? If you ran the SQL in 10g and assuming the tables are fully cached, it will always be faster than running the same SQL in 11g because 11g is doing direct path reads. In other words, if I upgrade my database to 11g, the application will suffer significant performance degradation. So the question is how do we get around this problem? Thanks. Followup July 27, 2009 - 7pm Central time zone: ... t will always be faster than runningthe same SQL in 11g because 11g is doing direct path reads. ... bzzzt - false. did you read where I said above: ... undoubtedly it decided for a direct path read, avoid entirely the huge overhead of the buffer cache. .... parallel query has been doing this for years and years (since about 1994). A full scan reading from the buffer cache is horribly painful - we have to read a block at a time, not multiblocks - and it takes hundreds, if not thousands of cpu cycles/instructions to get a single block and dozens of latches (locks, serialization devices) You are making an assumption (cache must always be faster than disk) that is not true. prove to us that reading 1,000,000 blocks from the buffer cache on a normal system is "always going to be faster" than getting from disk.
a
Bookmark | Bottom | Top
Reviewer:
A reader
Tom, earlier in this thread, you mentioned that we shouldn't be setting too many initialization parameters - e.g. setting pools to zero. I was under the impression that if automatic memory management is used, by setting pools to zero, Oracle would automatically figure out over time what's the best settings for each of the pools based on the usage of the application. Is this not correct understanding? Can you point me to a thread on how automatic memory management should be used properly? Followup July 27, 2009 - 8pm Central time zone: You need not set anything other then the parameter to control how much memory to use - SGA_TARGET (just automatic sga memory management) or MEMORY_TARGET (in 11g, auto tune the pga AND sga together).You need not set the individual pools
a
Bookmark | Bottom | Top
Reviewer:
Nadya from Russia
... "KGH: NO ACCESS " 1046641408 ... We have to set the pools manual.
a
Bookmark | Bottom | Top
Reviewer:
fakdaddy from MO
Curious about the comment "Bear in mind tho... there is really no true way to a have a purely "in memory table -- even cached tables are subject to aging from the buffer cache. " So for exaggeration purposes - I have a 1GB KEEP cache, and I assign 1x 100MB table to it - the blocks would be aged out ?? I have never configured a keep cache before as I figure oracle is cleverer than myself to mange. However we have a critical batch app (more critical than its day use) .... so we are considering a KEPP pool to cach high read/low size tables so they don't get aged out during the 12HOUR OLTP/DAY window. Followup September 27, 2010 - 9am Central time zone: your keep cache is the same as your recycle cache is the same as the default cache as far as holding blocks in memory go. The keep cache is not a special "keep this in memory" cache, it manages blocks the same way as the other caches. It is just a 'naming convention', you put things you want to 'keep' in the nicely sized keep pool and things you don't in the smaller sized recycle pool.The goal of the keep pool would be to make it have very few objects - just the things you would like to have in the cache as much as possible - so they are not competing with lots of other stuff for space.
a
Bookmark | Bottom | Top
Reviewer:
A reader
I was reading the internals of touch count algorithm. I have a question. In the document it was told that, "Oracle only allows buffer’s touch count to be incremented, at most, once every 3 seconds. When a touch count is incremented buffer pointer should move. But movement of buffer pointer is independent of touch count increment. Also for any activity in memory area oracle needs a latch for assuring cache consistency. But there is an exception here !! For updating touch count, Oracle does not use latch and buffer block can be modified while touch count is getting incremented. But more interesting is that, two processes may increment the touch count to same value, and when this happens Oracle assures the worst that could happen is the touch count is not actually incremented every time a buffer is touched and that no cache corruption will result." Therefore, 1) Does it mean at every 3 seconds the touch count is increased only once for each buffer? If not why? Also, I couldn't understand how the latch process increments the touch count? \ 2) How it is independently working with every 3 seconds process of incrementation of touch count to solve the problems? 3) When was Touch count algorithm introduced? Does it overwrites all the properties of Modified LRU? Followup October 19, 2011 - 6pm Central time zone: 1) first the latch processing. A latch is another name for a lock, a serialization device. In general, if I have a latch on something - you cannot get it (some latches can be shared, but generally for read only access to some memory segment).Since we are not using a latch to protect this counter - you can and will have the condition whereby two (or more) processes attempt to increase the counter at the same time. Suppose you and I both read the same block at about the same time. We both want to update the touch count. Suppose the existing touch count was 42 - we would both see 42, and then set the count to 43. It really should have been "44" since we both read it - but it won't be. But this is OK - we haven't 'corrupted' anything and we still have a 'pretty good' count. a touch count for a block is only incremented after about 3 seconds. 2) I don't know what you are trying to ask there. 3) it *is* a modified LRU type of algorithm. It was introduced back in the Oracle 8i days.
a
Bookmark | Bottom | Top
Reviewer:
A reader
a
Bookmark | Bottom | Top
Reviewer:
A reader
hi Tom, I have five-six tables having row count between 1000 to 25000. Is it advisable to keep the tables of such sizes in memory through CACHE? What is the ideal size for a table with 20 columns with average column size as 20 bytes. The SGA is 3GB What are the drawbacks of the CACHE aaproach? Please advise. Followup February 21, 2012 - 7pm Central time zone: no, do not use cache, if you use them - they'll be cached.... What is the ideal size for a table with 20 columns with average column size as 20 bytes. ... you are missing an important metric ;) the number of rows. alter table t cache does not cause a table to be cached. It changes the way the blocks are managed in the cache when the blocks are read during a full scan only. If you are using index access, it will have no affect.
a
Bookmark | Bottom | Top
Reviewer:
Ian Wallace from Dublin
I've read a number of comments with interest in this thread, especially around the use of the buffer cache and parallel queries. In the thread about you state the following: "Do the blocks need to be kept cached? should they be cached? are you using parallel query (more common with large full table scans) - then the cache doesn't really matter (direct io)." "... undoubtedly it decided for a direct path read, avoid entirely the huge overhead of the buffer cache. ...." And finally.... "parallel query has been doing this for years and years (since about 1994). A full scan reading from the buffer cache is horribly painful - we have to read a block at a time, not multiblocks - and it takes hundreds, if not thousands of cpu cycles/instructions to get a single block and dozens of latches (locks, serialization devices)" This topic is very pertinent to us at the moment. We're designing a reporting solution whereby we have created one denormalized table which is composite partitioned RANGE - HASH, the theory being that with partition elimination we will always full scan a sub partition. We have a 4 node RAC cluster and we're hoping to take advantage of parallelism. We're currently designing this solution with some external consultants who have suggested defining this table in the KEEP pool. The size of the KEEP pool would be circa 10GB whereas the size of our SGA is currently 30GB. Based on what you've said above would we actually make this of this pool, especially if parallel query is using direct I/O? Also could you provide a simple example showing how the use of parallel query negates the use of the buffer pool? The database version is 10.2.0.2 on Red Hat Linux 4. Many thanks, Ian. Followup May 30, 2012 - 7am Central time zone: ask them "why", since full scans in general will flush the buffer cache and do direct IO from disk?Just run a parallel query and tell me if you see logical IO's or not ;) In 11g, there is a new in memory parallel query http://docs.oracle.com/cd/E11882_01/server.112/e16638/memory.htm#i30761 but that doesn't apply to you. and even then, I wouldn't be using the KEEP pool
a
Bookmark | Bottom | Top
Reviewer:
Alexander
Did you mean "will NOT flush the buffer cache..."? Or what the CACHE table setting somewhere implied that I missed? Followup May 30, 2012 - 11pm Central time zone: No, I meant it would do a segment level checkpoint - to get the blocks for the segment you are scanning onto disk so it can bypass the inefficiencies of the buffer cache and just do direct IO from disk.if they didn't flush those blocks to disk - they might read the wrong version (there could be a committed version in the cache that is not on disk). I should have said "parallel full scans" for 10g, but in 11g - just saying "full scans" is sufficient
a
Bookmark | Bottom | Top
Reviewer:
Ian Wallace from Dublin
Could you clarify what you mean when you say full scans in general will flush the buffer cache and do direct path I/O from disk. My understanding is that parallel query in 10g will just bypass the buffer cache and always do direct I/O? Thanks, Ian. Followup May 31, 2012 - 12am Central time zone: see right above, I clarified that.
a
Bookmark | Bottom | Top
Reviewer:
A reader
Hi tom, How is ORACLE managing 'keep' pool itself? Let's see i use 'storage' clause to put many tables into 'keep' pool, whose total size can not be accormordated by 'keep' pool size, which one will be 'out'? Is it following the same logic that full table scan one will be the first candidate? Followup June 1, 2012 - 6am Central time zone: the keep and recycle pool are identical in nature of the "buffer pool"by default segments are cached in the buffer pool, the default pool You can alter/create a segment and specify to use the keep or recycle pools instead. Once they are marked for those pools they will be cached in those pools instead of the default pool - but the way they are cached are identical to the default pool. I wish the keep pool was named "non-default pool1" and the recycle pool was named "non-default pool2". In your question: Let's see i use 'storage' clause to put many tables into 'keep' pool, whose total size can not be accormordated by 'keep' pool size, which one will be 'out'? Is it following the same logic that full table scan one will be the first candidate? just replace keep with 'default' (or recycle even) and the answer is the same. when the X pool fills up, dbwr is called upon to make space in it using LRU like algorithms. |