Performance Tuning [Gain from Project] -- WWPRT

Experience from WWPRT

1. Use the results of an uncorrelated inner query to reduce the join scale

Scenario:

Matched products -- the product ids are not the same, but offeringname, geography and variantname are the same.

Now suppose that we have more than 1,000,000 products in the DB, and we need to find all matched products.

I. The slow version, which takes more than 11 minutes to complete

select ov1.id, ov2.id, ov1.offeringname, vm1.variantname, vm1.geo
       from wwprt.offering_variant ov1 
            inner join wwprt.variant v1 on v1.id = ov1.variantid and v1.hasmap = 'Y'
            inner join wwprt.variant_map vm1 on  vm1.variantid  = v1.id

            inner join wwprt.offering_variant ov2 on ov2.offeringname = ov1.offeringname and ov2.id <> ov1.id      
            inner join wwprt.variant v2 on v2.id = ov2.variantid  and v2.hasmap = 'Y'
            inner join wwprt.variant_map vm2 on vm2.variantid = v2.id and vm2.variantname = vm1.variantname and vm2.geo = vm1.geo 
             
      order by ov1.offeringname, vm1.variantname, vm1.geo, ov1.id, ov2.id 
WITH UR	  

Performance analysis:

a. ov1 goes through all products.

b. ov2 goes through all products.

c. All products are joined with all products, and the matched products are then picked out of the join results.

Key: joining all products with all products produces a huge number of intermediate join results.
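To make the size of that intermediate result concrete, here is a minimal sketch that counts the candidate pairs produced by the ov1/ov2 self-join alone and compares them with the pairs that finally match. It runs on SQLite through Python's sqlite3 module rather than on DB2, so the schema prefix and WITH UR are dropped; the table definitions and the sample data are assumptions inferred from the queries above, and a real optimizer may not materialize every candidate pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table layouts, inferred from the columns used in the queries.
conn.executescript("""
    create table offering_variant (id integer primary key, offeringname text, variantid integer);
    create table variant (id integer primary key, hasmap text);
    create table variant_map (variantid integer, variantname text, geo text);
""")

# Hypothetical data: 100 products under one offering, each with its own variant name.
n = 100
ids = range(1, n + 1)
conn.executemany("insert into variant values (?, 'Y')", [(i,) for i in ids])
conn.executemany("insert into offering_variant values (?, 'OFF-A', ?)", [(i, i) for i in ids])
conn.executemany("insert into variant_map values (?, ?, 'US')", [(i, f"V{i}") for i in ids])
# Make product 2 a match for product 1 (same offeringname, variantname and geo).
conn.execute("update variant_map set variantname = 'V1' where variantid = 2")

# Pairs produced by the ov1/ov2 self-join on offeringname alone.
candidate_pairs = conn.execute("""
    select count(*)
    from offering_variant ov1
        inner join offering_variant ov2
            on ov2.offeringname = ov1.offeringname and ov2.id <> ov1.id
""").fetchone()[0]

# Pairs that survive the variantname and geo predicates.
matched_pairs = conn.execute("""
    select count(*)
    from offering_variant ov1
        inner join variant v1 on v1.id = ov1.variantid and v1.hasmap = 'Y'
        inner join variant_map vm1 on vm1.variantid = v1.id
        inner join offering_variant ov2
            on ov2.offeringname = ov1.offeringname and ov2.id <> ov1.id
        inner join variant v2 on v2.id = ov2.variantid and v2.hasmap = 'Y'
        inner join variant_map vm2 on vm2.variantid = v2.id
            and vm2.variantname = vm1.variantname and vm2.geo = vm1.geo
""").fetchone()[0]

print(candidate_pairs, matched_pairs)  # prints "9900 2"
```

With only 100 products the self-join already yields 9,900 candidate pairs for 2 real matches, and the number of candidates grows quadratically with the number of products that share an offeringname.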

II. The fast version, which takes only around 35 seconds

Idea: quickly qualify the offering variants by using the results of an uncorrelated inner query

select 
	ov1.id as ov1_id, ov2.id as ov2_id, ov1.offeringname, vm1.variantname, vm1.geo
from 
	wwprt.offering_variant ov1 
	inner join wwprt.variant v1 
		on v1.id = ov1.variantid 
		and v1.hasmap = 'Y'
	inner join wwprt.variant_map vm1 
		on  vm1.variantid  = v1.id
	inner join wwprt.offering_variant ov2
		on ov2.offeringname = ov1.offeringname
		and ov2.id <> ov1.id 
	inner join wwprt.variant v2 
		on v2.id = ov2.variantid  
		and v2.hasmap = 'Y'
	inner join wwprt.variant_map vm2 
		on vm2.variantid = v2.id
		and vm2.variantname = vm1.variantname
		and vm2.geo = vm1.geo
	where (ov1.offeringname, vm1.variantname, vm1.geo) in (
		select 
			ov3.offeringname, vm3.variantname, vm3.geo
		from 
			wwprt.offering_variant ov3 
			inner join wwprt.variant v3 
				on v3.id = ov3.variantid 
				and v3.hasmap = 'Y'
			inner join wwprt.variant_map vm3 
				on  vm3.variantid  = v3.id
		group by 
			ov3.offeringname, vm3.variantname, vm3.geo
		having 
			count(*) > 1)
order by 
	ov1.offeringname, vm1.variantname, vm1.geo, ov1_id, ov2_id
with UR;

Performance analysis:

a. The scale of ov1 is reduced: ov1 only goes through the rows that match the results of the inner subquery:

select 
	ov3.offeringname, vm3.variantname, vm3.geo
from 
	wwprt.offering_variant ov3 
	inner join wwprt.variant v3 
		on v3.id = ov3.variantid 
		and v3.hasmap = 'Y'
	inner join wwprt.variant_map vm3 
		on vm3.variantid = v3.id
group by 
	ov3.offeringname, vm3.variantname, vm3.geo
having 
	count(*) > 1

b. ov2 still goes through all products.

c. Only a very small part of the products is joined with all products, so there is a very small set of join results to process.
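As a sanity check on the rewrite, the sketch below runs both versions of the query on a tiny data set and confirms that the pre-filter changes only the amount of work, not the result. It uses SQLite through Python's sqlite3 module instead of DB2 (row-value IN needs SQLite 3.15 or later), drops the schema prefix and WITH UR, and relies on assumed table definitions and made-up sample rows; it does not reproduce the timings quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table layouts, inferred from the columns used in the queries.
conn.executescript("""
    create table offering_variant (id integer primary key, offeringname text, variantid integer);
    create table variant (id integer primary key, hasmap text);
    create table variant_map (variantid integer, variantname text, geo text);
""")

# Hypothetical data: only products 1 and 2 share offeringname, variantname and geo.
conn.executemany("insert into variant values (?, 'Y')", [(i,) for i in range(1, 6)])
conn.executemany("insert into offering_variant values (?, ?, ?)", [
    (1, "OFF-A", 1), (2, "OFF-A", 2), (3, "OFF-A", 3), (4, "OFF-B", 4), (5, "OFF-C", 5)])
conn.executemany("insert into variant_map values (?, ?, ?)", [
    (1, "V1", "US"), (2, "V1", "US"), (3, "V2", "US"), (4, "V1", "US"), (5, "V1", "EU")])

# The join shared by both versions.
BASE = """
select ov1.id as ov1_id, ov2.id as ov2_id, ov1.offeringname, vm1.variantname, vm1.geo
from offering_variant ov1
    inner join variant v1 on v1.id = ov1.variantid and v1.hasmap = 'Y'
    inner join variant_map vm1 on vm1.variantid = v1.id
    inner join offering_variant ov2
        on ov2.offeringname = ov1.offeringname and ov2.id <> ov1.id
    inner join variant v2 on v2.id = ov2.variantid and v2.hasmap = 'Y'
    inner join variant_map vm2 on vm2.variantid = v2.id
        and vm2.variantname = vm1.variantname and vm2.geo = vm1.geo
"""

# The uncorrelated inner query: keep only key triples that occur more than once.
PREFILTER = """
where (ov1.offeringname, vm1.variantname, vm1.geo) in (
    select ov3.offeringname, vm3.variantname, vm3.geo
    from offering_variant ov3
        inner join variant v3 on v3.id = ov3.variantid and v3.hasmap = 'Y'
        inner join variant_map vm3 on vm3.variantid = v3.id
    group by ov3.offeringname, vm3.variantname, vm3.geo
    having count(*) > 1)
"""

ORDER = " order by ov1.offeringname, vm1.variantname, vm1.geo, ov1_id, ov2_id"

slow_rows = conn.execute(BASE + ORDER).fetchall()
fast_rows = conn.execute(BASE + PREFILTER + ORDER).fetchall()

print(fast_rows)
# [(1, 2, 'OFF-A', 'V1', 'US'), (2, 1, 'OFF-A', 'V1', 'US')]
print(slow_rows == fast_rows)  # True
```

The inner query does not reference ov1, so it can be evaluated once, and its small result then limits which ov1 rows take part in the expensive self-join.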

2. Use a left join in place of not exists()

Reason:

Tim: Use of 'NOT EXISTS' is discouraged. It is usually several times faster to implement the same check with an outer join. The NOT EXISTS subquery is executed at row level, so internally the DB issues this query once for each row to be evaluated.

I. The SQL with not exists()

Explanation: before inserting into PRODUCT_CTRY_JOIN_CN, we need to filter out rows that already exist, to avoid a unique constraint error.

  insert into WWPRT.PRODUCT_CTRY_JOIN_CN (PRODUCTID, COUNTRY, ANNDOCNO, DELETED, SYSOWNER ) 
        select D.PROCESSINGPRODID, prodcty.country, prodcty.anndocno, prodcty.deleted, prodcty.sysowner 
               from wwprt.product_ctry_join_cn prodcty 
                    inner join DUPLICATED_PRODUCT_MSG D on prodcty.productid = D.PRODUCTID and prodcty.country = D.country
               where 
               -- Avoid the unique constraint error if prodid and country already exist
               not exists( 
                     select 1 from wwprt.product_ctry_join_cn iprodcty where iprodcty.productid = D.PROCESSINGPRODID and iprodcty.country = D.country                                                                              
               )

II. The SQL rewritten with a left join

  insert into WWPRT.PRODUCT_CTRY_JOIN_CN (PRODUCTID, COUNTRY, ANNDOCNO, DELETED, SYSOWNER ) 
          select D.PROCESSINGPRODID, prodcty1.country, prodcty1.anndocno, prodcty1.deleted, prodcty1.sysowner
                       from wwprt.product_ctry_join_cn prodcty1
                       inner join DUPLICATED_PRODUCT_MSG D on prodcty1.productid = D.PRODUCTID and prodcty1.country = D.country
                      -- Avoid the unique constraint error if prodid and country already exist
                       left join wwprt.product_ctry_join_cn prodcty2 on prodcty2.productid = D.PROCESSINGPRODID and prodcty2.country = D.country
                  where prodcty2.productid is null and prodcty2.country is null
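The two forms can be checked against each other on a small data set. The sketch below is a minimal illustration using SQLite through Python's sqlite3 module instead of DB2; the table definitions (including the primary key on productid and country) and the sample rows are assumptions made for the example, and it only shows that both selects return the same rows, not which one is faster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table layouts; the primary key stands in for the unique constraint.
conn.executescript("""
    create table product_ctry_join_cn (
        productid integer, country text, anndocno text, deleted text, sysowner text,
        primary key (productid, country));
    create table duplicated_product_msg (
        productid integer, processingprodid integer, country text);
""")

# Hypothetical data: product 10 is a duplicate of product 20.
# (20, 'US') already exists, (20, 'CA') does not.
conn.executemany("insert into product_ctry_join_cn values (?, ?, ?, ?, ?)", [
    (10, "US", "A1", "N", "sys"), (10, "CA", "A2", "N", "sys"), (20, "US", "A9", "N", "sys")])
conn.executemany("insert into duplicated_product_msg values (?, ?, ?)", [
    (10, 20, "US"), (10, 20, "CA")])

NOT_EXISTS_SELECT = """
select D.processingprodid, prodcty.country, prodcty.anndocno, prodcty.deleted, prodcty.sysowner
from product_ctry_join_cn prodcty
    inner join duplicated_product_msg D
        on prodcty.productid = D.productid and prodcty.country = D.country
where not exists (
    select 1 from product_ctry_join_cn iprodcty
    where iprodcty.productid = D.processingprodid and iprodcty.country = D.country)
"""

LEFT_JOIN_SELECT = """
select D.processingprodid, prodcty1.country, prodcty1.anndocno, prodcty1.deleted, prodcty1.sysowner
from product_ctry_join_cn prodcty1
    inner join duplicated_product_msg D
        on prodcty1.productid = D.productid and prodcty1.country = D.country
    left join product_ctry_join_cn prodcty2
        on prodcty2.productid = D.processingprodid and prodcty2.country = D.country
where prodcty2.productid is null and prodcty2.country is null
"""

not_exists_rows = conn.execute(NOT_EXISTS_SELECT).fetchall()
left_join_rows = conn.execute(LEFT_JOIN_SELECT).fetchall()

INSERT = ("insert into product_ctry_join_cn "
          "(productid, country, anndocno, deleted, sysowner) " + LEFT_JOIN_SELECT)
conn.execute(INSERT)  # inserts the missing (20, 'CA') row
conn.execute(INSERT)  # a second run finds nothing left to insert
total_rows = conn.execute("select count(*) from product_ctry_join_cn").fetchone()[0]

print(left_join_rows)  # [(20, 'CA', 'A2', 'N', 'sys')]
print(total_rows)      # 4
```

Because the left join keeps only the rows with no existing (productid, country) partner, the insert can be repeated without raising a unique constraint error.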
