Today I received an email from a user reporting that an ST_Geometry SQL query was running very slowly:
select Count(*) from ghh t where sde.st_within(t.shape, sde.st_buffer( sde.st_multipoint('multipoint(490021.7775 303342.81825, 489497.07 303822.42475,489250.1075 303778.5025)', t.shape.srid), 300)) = 1;
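The first time I ran it on my machine it actually caused a blue screen, possibly because I had not shut the machine down for several days. After a reboot the query did run, but it took more than two minutes.
From the SQL above we can see that GHH is a point layer (12,980 points in total). The query buffers a multipoint by 300 and counts how many points of the layer fall within that buffer.
Here is the output of the run for reference: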
SQL> select Count(*) from ghh t where sde.st_within(t.shape, sde.st_buffer( sde.st_multipoint('multipoint(490021.7775 303342.81825, 489497.07 303822.42475,489250.1075 303778.5025)', t.shape.srid), 300)) = 1;

Elapsed: 00:02:35.34

Execution Plan
----------------------------------------------------------
Plan hash value: 223515074

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |  2311 |    69   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |  2311 |            |          |
|*  2 |   TABLE ACCESS FULL| GHH  |   144 |   324K|    69   (2)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("SDE"."ST_WITHIN"("T"."SHAPE","SDE"."ST_BUFFER"("ST_MULTIPOINT"."ST_MULTIPOINT"(
              'multipoint(490021.7775 303342.81825, 489497.07 303822.42475,489250.1075 303778.5025)',
              "T"."SYS_NC00016$"),300))=1)

Note
-----
   - dynamic sampling used for this statement

Statistics
----------------------------------------------------------
     108342  recursive calls
    2284480  db block gets
     339330  consistent gets
         53  physical reads
          0  redo size
        344  bytes sent via SQL*Net to client
        360  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed
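As you can see, the query did a full table scan. I was using an operator that is supposed to use the spatial index (ST_WITHIN), so why did it fall back to a full table scan? With the help of my colleague LiuF, we spotted one small detail:
t.shape.srid. Could this be the problem? We replaced t.shape.srid with the actual SRID value and ran the query again: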
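For reference, the rewritten statement would look roughly like the sketch below. The literal 48 is not a general constant: it is simply the SRID that appears in the predicate section of the plan output that follows, so substitute the SRID of your own layer.

```sql
-- Same query as before, with the per-row t.shape.srid replaced by a literal SRID.
-- 48 is the value seen in the plan's predicate for this layer (layer-specific).
select count(*)
  from ghh t
 where sde.st_within(
         t.shape,
         sde.st_buffer(
           sde.st_multipoint('multipoint(490021.7775 303342.81825, 489497.07 303822.42475,489250.1075 303778.5025)', 48),
           300)) = 1;
```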
Elapsed: 00:00:00.10

Execution Plan
----------------------------------------------------------
Plan hash value: 4255557756

----------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |          |     1 |  2311 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE                         |          |     1 |  2311 |            |          |
|*  2 |   DOMAIN INDEX (Sel: Default - No Stats)| A361_IX1 |       |       |    18E  (0)|          |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("SDE"."ST_WITHIN"("T"."SHAPE","SDE"."ST_BUFFER"("ST_MULTIPOINT"."ST_MULTIPOINT"(
              'multipoint(490021.7775 303342.81825, 489497.07 303822.42475,489250.1075 303778.5025)',
              48),300))=1)

Note
-----
   - dynamic sampling used for this statement
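The difference is huge: from 2 minutes 35 seconds down to 0.10 seconds, and the query now uses the index.
Let's look at what selecting t.shape.srid does on its own: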
SQL> select t.shape.srid from ghh t;

12980 rows selected.

Elapsed: 00:00:01.18

Execution Plan
----------------------------------------------------------
Plan hash value: 4166285587

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 14357 |   182K|    68   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| GHH  | 14357 |   182K|    68   (0)| 00:00:01 |
--------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
       1275  consistent gets
          0  physical reads
          0  redo size
     139650  bytes sent via SQL*Net to client
       9875  bytes received via SQL*Net from client
        867  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      12980  rows processed
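So that is the cause: with t.shape.srid in the query, the database scans the whole table and reads the SRID of every single feature, which clearly drags the query down. This matches the two plans: with t.shape.srid the ST_WITHIN predicate is applied as a filter on a full table scan, while with a literal SRID it becomes an access predicate on the domain index.
It goes to show that one tiny detail can have an enormous impact on performance.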
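If you do not know the SRID of a layer up front, one option is to look it up once in a separate statement and then use the value as a literal in the spatial query. The sketch below is only a suggestion. The first statement uses nothing beyond what appears in this post plus Oracle's rownum; the second assumes the layer is registered in the SDE.ST_GEOMETRY_COLUMNS metadata table, and its table and column names should be checked against your own geodatabase.

```sql
-- One-off lookup: read the SRID from a single row instead of all 12980.
select t.shape.srid from ghh t where rownum = 1;

-- Alternative (assumption: the layer is registered in this SDE metadata table
-- and the table exposes TABLE_NAME and SRID columns).
select srid from sde.st_geometry_columns where table_name = 'GHH';
```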