low_value / high

If you take a look at view user_tab_columns (or equivalent) you will find that its definition includes columns low_value and high_value, which give the currently known low and high value for the data. For example:

 
            select  
           
            column_name, low_value, high_value  
           
            from  
           
            user_tab_columns  
           
            where  
           
            table_name  = 'T1'  
           
            and column_name = 'SKEW'  
           
            ;          
           
            COLUMN_NAME          LOW_VALUE       HIGH_VALUE  
           
            -------------------- --------------- ---------------  
           
            SKEW                 C102            C151

This is fine, of course, if you are good at translating internal forms of data into human-readable forms.

For those of us less gifted, the dbms_stats package gives us some help with a set of procedures with names like convert_raw_value(). There are three overloaded functions to handle varchar2,date, and number types (with binary_float and binary_double appearing as new options in 10g), and two special cases: convert_raw_value_nvarchar() and convert_raw_value_rowid(). The procedures all take two parameters – the first in an input which is the raw value to be converted, the second is an output which has to be a variable of the correct type, so, for example:

 
            SQL> declare  
           
            2     m_n     number;  
           
            3  begin  
           
            4     dbms_stats.convert_raw_value('C102',m_n);  
           
            5     dbms_output.put_line('Result: ' || m_n);  
           
            6  end;  
           
            7  .  
           
            SQL> /  
           
            Result: 1

For convenience, you can wrap calls to the procedures inside functions, which then makes them available for use in SQL.

The following example is just to get you started (the column called skew is one I know to be anumber type) – you would probably want to create a package to hold several different functions, and declare the functions as deterministic.

 
            create or replace function raw_to_num(i_raw raw)  
           
            return number  
           
            as  
           
            m_n number;  
           
            begin  
           
            dbms_stats.convert_raw_value(i_raw,m_n);  
           
            return m_n;  
           
            end;  
           
            /      
           
            select  
           
            column_name,  
           
            raw_to_num(low_value)       low_val,  
           
            raw_to_num(high_value)      high_val  
           
            from  
           
            user_tab_columns  
           
            where  
           
            table_name  = 'T1'  
           
            and column_name = 'SKEW'  
           
            ;      
           
            COLUMN_NAME             LOW_VAL   HIGH_VAL  
           
            -------------------- ---------- ----------  
           
            SKEW                          1         80

To finish off, you would probably want to query the user_tab_columns (or related) views with adecode() on the data_type to decide which of the functions to use, and then perform an explicitto_char() on the result of the decode(). But that task is left as an exercise for the reader.

Addendum: If you read the comments you will see that some of them have taken up the exercise and supplied some answers.

Fake HistogramsIn "CBO"

Extended statsIn "12c"

Virtual columns revisitedIn "Indexing"

Comments (21)

21 Comments »

 
              create or replace function raw_to_num(i_raw raw) 
             
              return number 
             
              as 
             
              m_n number; 
             
              begin 
             
              dbms_stats.convert_raw_value(i_raw,m_n); 
             
              return m_n; 
             
              end; 
             
              /   
             
              create or replace function raw_to_date(i_raw raw) 
             
              return date 
             
              as 
             
              m_n date; 
             
              begin 
             
              dbms_stats.convert_raw_value(i_raw,m_n); 
             
              return m_n; 
             
              end; 
             
              /   
             
              create or replace function raw_to_varchar2(i_raw raw) 
             
              return varchar2 
             
              as 
             
              m_n varchar2(20); 
             
              begin 
             
              dbms_stats.convert_raw_value(i_raw,m_n); 
             
              return m_n; 
             
              end; 
             
              /  
             
              select  
             
              column_name, 
             
              decode(data_type, 
             
              'VARCHAR2',to_char(raw_to_varchar2(low_value)), 
             
              'DATE',to_char(raw_to_date(low_value)), 
             
              'NUMBER',to_char(raw_to_num(low_value)) 
             
              ) low_value, 
             
              decode(data_type, 
             
              'VARCHAR2',to_char(raw_to_varchar2(high_value)), 
             
              'DATE',to_char(raw_to_date(high_value)), 
             
              'NUMBER',to_char(raw_to_num(high_value)) 
             
              ) high_value 
             
              from user_tab_columns  
             
              where table_name='';

Exercise done.- reader

Comment by anonymous — November 29, 2006 @ 8:02 pm BST Nov 29,2006 | Reply

Seems funny that Oracle should put these procedures in the DBMS_STATS package when the UTL_RAW package has a bunch of casting functions for RAW values (CAST_TO_BINARY_DOUBLE, CAST_TO_BINARY_FLOAT, CAST_TO_BINARY_INTEGER, CAST_TO_NUMBER, CAST_TO_NVARCHAR2, CAST_TO_VARCHAR2). Note the absence of a CAST_TO_DATE function.

Cheers

Tim…

Comment by Tim Hall — November 30, 2006 @ 8:01 am BST Nov 30,2006 | Reply

We have found out dependence of LOW_VALUE, HIGH_VALUE from parameters NLS_COMP, NLS_SORT. However DBMS_STATS documentation is not present the description of such behaviour…

Comment by Pavel Kozlov — June 7, 2007 @ 8:10 am BST Jun 7,2007 | Reply

I am attempting to change the statistics value of high_value with the help of dbms_stats.set_column_stats. For some reason I struggle to find a function to convert date to raw. Am I looking in the wrong place or am I missing something?

Comment by Jan Rod — August 24, 2009 @ 7:13 pm BST Aug 24,2009 | Reply

Jan,

you don’t have to convert DATE to RAW when setting low/high value. Just let the dbms_stats.prepare_column_stats do the dirty work for you. I hope there are no bug in my EXAMPLE :)

 
                create or replace function raw_to_date(i_raw raw) 
               
                return date 
               
                as 
               
                m_n date; 
               
                begin 
               
                dbms_stats.convert_raw_value(i_raw,m_n); 
               
                return m_n; 
               
                end; 
               
                /  
               
                ALTER SESSION SET NLS_DATE_FORMAT='YYYY.MM.DD'; 
               
                CREATE TABLE t_d1 (d_date DATE); 
               
                INSERT INTO t_d1 VALUES(TO_DATE('2009.08.01', 'YYYY.MM.DD')); 
               
                INSERT INTO t_d1 VALUES(TO_DATE('2009.08.31', 'YYYY.MM.DD')); 
               
                COMMIT; 
               
                --gather_stats 
               
                BEGIN 
               
                DBMS_STATS.gather_table_stats (user, 
               
                'T_D1', 
               
                estimate_percent => NULL, 
               
                method_opt => 'FOR ALL COLUMNS SIZE 1', 
               
                CASCADE => TRUE, 
               
                FORCE => TRUE 
               
                ); 
               
                END; 
               
                / 
               
                --show low/high val 
               
                SELECT low_value, raw_to_date(low_value), high_value, raw_to_date(high_value) 
               
                FROM user_tab_col_statistics 
               
                WHERE table_name = 'T_D1' AND column_name = 'D_DATE' 
               
                / 
               
                /*  !!!RESULT!!! 
               
                LOW_VALUE            RAW_TO_DAT HIGH_VALUE           RAW_TO_DAT 
               
                -------------------- ---------- -------------------- ---------- 
               
                786D0801010101       2009.08.01 786D081F010101       2009.08.31 
               
                */ 
               
                --increase high value by ONE MONTH 
               
                DECLARE 
               
                n_distcnt          NUMBER; 
               
                n_density          NUMBER; 
               
                n_nullcnt          NUMBER; 
               
                rec_srec           DBMS_STATS.statrec; 
               
                datevals           DBMS_STATS.DATEARRAY; 
               
                n_avgclen          NUMBER; 
               
                d_low              DATE; 
               
                d_high             DATE; 
               
                BEGIN 
               
                DBMS_STATS.get_column_stats (USER, 
               
                'T_D1', 
               
                'D_DATE', 
               
                distcnt      => n_distcnt, 
               
                density      => n_density, 
               
                nullcnt      => n_nullcnt, 
               
                srec         => rec_srec, 
               
                avgclen      => n_avgclen 
               
                ); 
               
                --handles just the case when HISTOGRAMS are not set -> rec_srec.epc = 2 
               
                IF rec_srec.epc = 2 THEN 
               
                SELECT raw_to_date(low_value), raw_to_date(high_value) INTO d_low, d_high 
               
                FROM user_tab_col_statistics 
               
                WHERE table_name = 'T_D1' AND column_name = 'D_DATE'; 
               
                d_high := ADD_MONTHS(d_high, 1); 
               
                datevals := DBMS_STATS.DATEARRAY(d_low, d_high); 
               
                rec_srec.minval:=NULL; 
               
                rec_srec.maxval:=NULL; 
               
                rec_srec.bkvals:=NULL; 
               
                rec_srec.novals:=NULL; 
               
                --this procedure will set epc.minval, epc.maxval etc in INTERNAL FORMAT 
               
                DBMS_STATS.PREPARE_COLUMN_VALUES(rec_srec, datevals); 
               
                --and then just set statistics 
               
                DBMS_STATS.set_column_stats (USER, 
               
                'T_D1', 
               
                'D_DATE', 
               
                distcnt      => n_distcnt, 
               
                density      => n_density, 
               
                nullcnt      => n_nullcnt, 
               
                srec         => rec_srec, 
               
                avgclen      => n_avgclen 
               
                ); 
               
                END IF;     
               
                END; 
               
                / 
               
                --show low/high val , HIGH_VAL IS increased 
               
                SELECT low_value, raw_to_date(low_value), high_value, raw_to_date(high_value) 
               
                FROM user_tab_col_statistics 
               
                WHERE table_name = 'T_D1' AND column_name = 'D_DATE' 
               
                / 
               
                /*  !!!RESULT!!! 
               
                LOW_VALUE            RAW_TO_DAT HIGH_VALUE           RAW_TO_DAT 
               
                -------------------- ---------- -------------------- ---------- 
               
                786D0801010101       2009.08.01 786D091E010101       2009.09.30 
               
                */

Comment by Pavol Babel — August 24, 2009 @ 8:42 pm BST Aug 24,2009 | Reply

Can you please tel me what > is used for and how ? what do i enter when i get prompt ? thanks

Comment by Jim — January 20, 2012 @ 8:47 pm BST Jan 20,2012 | Reply
- This is a side effect of wordpress and it’s handling of the “greater than” sign. In some cases it comes out as “>”. I’ve fixed it in the example above.
  
  Comment by Jonathan Lewis — January 21, 2012 @ 5:04 pm BST Jan 21,2012
- thanx jonathan for repairing my example. As far I can remember it had been written before wordpress started to support sourcecode tag (so formating was poor). You have obviously fixed it layter, but > remained :) Now it seems to be 100% correct.
  
  Comment by Pavol Babel — January 21, 2012 @ 10:57 pm BST Jan 21,2012

BTW, you could set the RAW value directly into srec.maxval . But you have to write your own RAW TO DATE function.

RAW representation of DATE in user_tab_columns follows the OCI 7BYTE representation of DATE. You can read the OCI Docmuentation to understand it.

 
                select d_date,  dump(d_date) dump_date FROM t_d1 
               
                / 
               
                D_DATE     DUMP_DATE 
               
                ---------- ------------------------------------ 
               
                2009.08.01 Typ=12 Len=7: 120,109,8,1,1,1,1 
               
                2009.08.31 Typ=12 Len=7: 120,109,8,31,1,1,1

If you convert each number to HEXA and concat them, you’ll get RAW from low/high_val from user_tab_columns/user_tab_col_statistics

Comment by Pavol Babel — August 24, 2009 @ 8:56 pm BST Aug 24,2009 | Reply

http://marogel.wordpress.com/2013/04/29/utl_raw-cast_to_date/
shows an implementation

Comment by Matthias Rogel — April 29, 2013 @ 2:02 pm BST Apr 29,2013 | Reply

I am having problem setting the high value. Does it have to do with histograms already being set ? How can i set it then, i f that is the case. My rec_srec.epic value is 253
Thanks

Comment by Jim — June 10, 2014 @ 9:46 pm BST Jun 10,2014 | Reply

Jim,

If you’re using the sample in an earlier comment to set the high value then the pre-existence of a histogram probably is your problem. Since your rec_srec.epc = 253 you can modify the code by:
a) remove the if .. end if check on rec_srec.epc
b) add a line after the “datevals :=” line setting rec_srec.epc:= 2;

The epc (end point count) is the number of elements in the array, if the value doesn’t match the actual array you get an error whose number I forget.

For further examples on using set_column_stats see:http://jonathanlewis.wordpress.com/2010/03/23/fake-histograms/

Comment by Jonathan Lewis — June 11, 2014 @ 10:14 am BST Jun 11,2014 | Reply

Hi Jonathan

thanks very much for getting back to me

i implemented the change but here is my question and what I am trying to accomplish
That worked great by the way

I generaly gather column stats on the “CreateDate” column of my “DTree” table to get the high value for CreateDate

begin
dbms_stats.gather_table_stats(
ownname=> ‘LLPROD’,
tabname=> ‘DTREE’ ,
estimate_percent=> DBMS_STATS.AUTO_SAMPLE_SIZE,
cascade=> DBMS_STATS.AUTO_CASCADE,
degree=> null,
no_invalidate=> DBMS_STATS.AUTO_INVALIDATE,
granularity=> ‘AUTO’,
method_opt=> ‘FOR COLUMNS CREATEDATE SIZE 254′);
end;
/

Checking, I get 254 buckets from user_tab_col_statistics

But what I find is that I do not usually get the Highest value for CreateDate. probably because of sampling. I am guessing

Using your script with the mods you just sent me
I am setting the High Value for CreateDate to SYSDATE for the table

I now see the buckets are getting reset to 1
and the CreateDate does have the SYSDATE value

My question is do I have to be concerned that the Buckets are now 1 ?
What happens when the stats are collected by Oracle Enterprise Manager (10g) ?

Does the High Value get re-valued ?

Appreciate Your Help

Jim

Comment by Jim — June 16, 2014 @ 8:08 pm BST Jun 16,2014 | Reply

Jim,

So what you’re after is code to replace the current highest value in the histogram with a value you select while leaving the histogram intact. (I’d be inclined to run my own code to generate the whole histogram, but that might be duplicating too much run-time resource usage if can’t change the code that’s already there to gather the histogram).

I don’t think I’ve tried this before – so subject to testing and I may blog it some time over the next couple of weeks – but I think you could include a loop in the pl/sql to populate the date array and the bucket array – then putting in your high value on the last date array entry. You probably need to consider variations for frequency and height-balanced histograms, and the query for the bucket array might require you to use an analytic function to find differences between endpoint values.

Something like:

 
                d_array     := dbms_stats.datearray(); 
               
                srec.bkvals := dbms_stats.numarray(); 
               
                for r in ( 
               
                select  ... 
               
                from    user_tab_histograms 
               
                where   table_name = ... and column_name = ... 
               
                order by endpoint_value 
               
                ) loop 
               
                d_array.extend; 
               
                srec.bkvals.extend; 
               
                ct := ct+1; 
               
                d_array(ct) := r.endpoint_value; 
               
                srec.bkvals(ct) := r.{bucket_count_value} 
               
                end loop; 
               
                d_array(ct) := sysdate;

Comment by Jonathan Lewis — June 17, 2014 @ 7:48 am BST Jun 17,2014 | Reply

Hi Jonathan

I may now be getting way over my head with this new code

I am just trying to ensure that the highest value of the CREATEDATE column for the table DTREE is todays date
after i collect stats on the column:

 
              execute dbms_stats.gather_table_stats(ownname=>'LLPROD',tabname=>'DTREE',estimate_percent=>NULL,METHOD_OPT=>'FOR COLUMNS CREATEDATE SIZE 254');

 
              i check this in user_tab_col_statistics and you can see the date is not the sysdate yet

 
              col high_value        format a16 
             
              col low_value         format a16 
             
              select  
             
              low_value,  
             
              raw_to_date(low_value) low_date,  
             
              high_value,  
             
              raw_to_date(high_value) high_date 
             
              from  
             
              user_tab_col_statistics 
             
              where  
             
              table_name='DTREE' 
             
              and  
             
              column_name='CREATEDATE'; 
             
              LOW_VALUE        LOW_DATE         HIGH_VALUE       HIGH_DATE 
             
              ---------------- ---------------- ---------------- --------------- 
             
              74650101010101   01-JAN-01        7872060D0B1029   13-JUN-14 
             
              TABLE_NAME   COLUMN_NAME  LAST_ANALYZED            NUM_BUCKETS NUM_DISTINCT 
             
              ------------ ------------ ------------------------ ----------- ------------ 
             
              DTREE        CREATEDATE   17-JUN-2014 09:57:29             254      2870018

If I run the previous code I get the sysdate value but I only now have 1 bucket instead of 254

 
              LOW_VALUE        LOW_DATE         HIGH_VALUE       HIGH_DATE 
             
              ---------------- ---------------- ---------------- ---------------- 
             
              74650101010101   01-JAN-01        787206110B3A0E   17-JUN-14 
             
              TABLE_NAME   COLUMN_NAME  LAST_ANALYZED            NUM_BUCKETS NUM_DISTINCT 
             
              ------------ ------------ ------------------------ ----------- ------------ 
             
              DTREE        CREATEDATE   17-JUN-2014 10:57:13               1      2870018

Trying to implement your code is a bit overwhelming for me
as I am not a pl/sql expert so I am getting errors tring to incorporate

I am not sure what you meant by thee lines

 
              select  ... 
             
              srec.bkvals(ct) := r.{bucket_count_value} 
             
              Here is what i coded so far and the errors 
             
              I know I am missing the boat 
             
              DECLARE 
             
              ct                       NUMBER; 
             
              srec                     DBMS_STATS.statrec; 
             
              d_array                  DBMS_STATS.DATEARRAY; 
             
              BEGIN 
             
              d_array     := dbms_stats.datearray(); 
             
              srec.bkvals := dbms_stats.numarray(); 
             
              for r in ( 
             
              select  
             
              endpoint_value, 
             
              endpoint_number 
             
              from  
             
              user_tab_histograms 
             
              where  
             
              table_name = 'DTREE'  
             
              and 
             
              column_name = 'CREATEDATE' 
             
              order by  
             
              endpoint_value 
             
              ) loop 
             
              d_array.extend; 
             
              srec.bkvals.extend; 
             
              ct := ct+1; 
             
              d_array(ct) := r.endpoint_value; 
             
              srec.bkvals(ct) := r.endpoint_number; 
             
              end loop; 
             
              d_array(ct) := sysdate; 
             
              END; 
             
              / 
             
              d_array(ct)     := r.endpoint_value; 
             
              * 
             
              ERROR at line 27: 
             
              ORA-06550: line 27, column 20: 
             
              PLS-00382: expression is of wrong type 
             
              ORA-06550: line 27, column 3: 
             
              PL/SQL: Statement ignored

Any assitance is appreciated
Jim

Comment by Jim — June 17, 2014 @ 7:03 pm BST Jun 17,2014 | Reply

Jim,

Sorry, I’d assumed you were proficient in PL/SQL.

I’ll see if I can find an odd half hour some time later this week to write up an example as a new blog item. It may have to wait until I’m on the plane to Seattle for Kaleidoscope, though.

Comment by Jonathan Lewis — June 18, 2014 @ 5:56 am BST Jun 18,2014 | Reply

JL,
I wish I was .. I am self taught and I learn as I go. Appreciate when folks like you help out

I wil keep tring to figure it out though

Jim

Comment by Jim — June 18, 2014 @ 10:52 pm BST Jun 18,2014 | Reply

Hi Jonathan

I found some code (not mine) I implemented to get past the error I was getting and I am pretty confident that I am getting the results I need, but I cannot seem to get figure out the next step which is updating that last record in the histogram table.

Here is what I came up with thanks to your input and Mr. Rajeshwaran athttp://rajeshwaranbtech.blogspot.com/2011/01/histograms-on-date-varchar-and-numeric.html

tested the code in a select statment::

 
                    select  
                   
                    endpoint_value, 
                   
                    TO_CHAR (endpoint_value) endpoint_value_char, 
                   
                    TO_DATE (FLOOR (endpoint_value) || '.' || TO_CHAR (86400 * MOD (endpoint_value,1), 'FM999999999'),'J,sssss') endpoint_value_date, 
                   
                    endpoint_number 
                   
                    from  
                   
                    user_tab_histograms 
                   
                    where  
                   
                    table_name = 'DTREE'  
                   
                    and 
                   
                    column_name = 'CREATEDATE' 
                   
                    order by  
                   
                    endpoint_value; 
                   
                    tried the pl/sql code out 
                   
                    DECLARE 
                   
                    ct                       NUMBER :=0; 
                   
                    srec                     DBMS_STATS.statrec; 
                   
                    d_array                  DBMS_STATS.DATEARRAY; 
                   
                    BEGIN 
                   
                    d_array     := dbms_stats.datearray(); 
                   
                    srec.bkvals := dbms_stats.numarray(); 
                   
                    for r in ( 
                   
                    select  
                   
                    endpoint_value, 
                   
                    endpoint_number 
                   
                    from  
                   
                    user_tab_histograms 
                   
                    where  
                   
                    table_name = 'DTREE'  
                   
                    and 
                   
                    column_name = 'CREATEDATE' 
                   
                    order by  
                   
                    endpoint_value 
                   
                    )  
                   
                    loop 
                   
                    d_array.extend; 
                   
                    srec.bkvals.extend; 
                   
                    ct := ct+1; 
                   
                    d_array(ct) := TO_DATE (FLOOR (r.endpoint_value) || '.' || TO_CHAR (86400 * MOD (r.endpoint_value,1), 'FM999999999'),'J,sssss'); 
                   
                    srec.bkvals(ct) := r.endpoint_number; 
                   
                    DBMS_OUTPUT.put_line ('ct(' || ct || ')   ' || 'srec.bkvals='  || srec.bkvals(ct) || '   ' ||  
                   
                    'r.endpoint_value=' || r.endpoint_value || '   ' || 'd_array(' || ct   ||')' || d_array(ct)  
                   
                    ); 
                   
                    end loop;     
                   
                    DBMS_OUTPUT.put_line ('Setting Last value to SYSDATE'); 
                   
                    d_array(ct) := sysdate; 
                   
                    DBMS_OUTPUT.put_line ('d_array(' || ct  || ')' || d_array(ct) ); 
                   
                    END; 
                   
                    / 
                   
                    Sample Output: 
                   
                    ct(1)   srec.bkvals=0   r.endpoint_value=2305814                    d_array(1)01-JAN-01 
                   
                    ct(2)   srec.bkvals=1   r.endpoint_value=2449118.58960648   d_array(2)10-MAY-93 
                   
                    ct(3)   srec.bkvals=2   r.endpoint_value=2449798.59928241   d_array(3)21-MAR-95 
                   
                    ct(4)   srec.bkvals=3   r.endpoint_value=2450102.125             d_array(4)19-JAN-96 
                   
                    ct(5)   srec.bkvals=4   r.endpoint_value=2450303.64722222   d_array(5)07-AUG-96 
                   
                    . 
                   
                    . 
                   
                    . 
                   
                    . 
                   
                    ct(250)   srec.bkvals=251   r.endpoint_value=2455020.63541667   d_array(250)07-JUL-09 
                   
                    ct(251)   srec.bkvals=252   r.endpoint_value=2455029.45696759   d_array(251)16-JUL-09 
                   
                    ct(252)   srec.bkvals=253   r.endpoint_value=2455035.32701389   d_array(252)22-JUL-09 
                   
                    ct(253)   srec.bkvals=254   r.endpoint_value=2456836.575625       d_array(253)27-JUN-14 
                   
                    Setting Last value to SYSDATE 
                   
                    d_array(253)02-JUL-14 
                   
                    Your advice and thoughts are appreciated 
                   
                    I will continue to see if I can find some more info on the internet to complete this

Thanks
jim

Comment by Jim — July 2, 2014 @ 8:18 pm BST Jul 2,2014

Jim,

I wrote up an example of how to do this a few days ago, but forgot to publish it. There are a couple of little traps in the method I’ve used (which is just a quick and dirty, so there may be a smarter method). I’ll copy the whole demo script to the blog in a couple of minutes.

Comment by Jonathan Lewis — July 4, 2014 @ 8:21 pm BST Jul 4,2014

[…] is a quick response to a question on an old blog post asking how you can adjust the high value if you’ve already got a height-balanced histogram […]

Pingback by Adjusting Histograms | Oracle Scratchpad — July 4, 2014 @ 8:32 pm BST Jul 4,2014 | Reply

Hi Jonathan

will take a look at the code you wrote
really appreciate your time in this matter

Thanks
Jim

Comment by Jim — July 7, 2014 @ 4:17 pm BST Jul 7,2014 | Reply

RSS feed for comments on this post. TrackBack URI

low_value / high_value

Related

21 Comments »

Leave a Reply

你可能感兴趣的:(oracle)