这是美国时间6月10日,拉里做的12c inmemory option的现场发布会。
如今,内存数据库被大家广泛认可,懂得技术的人都明白,数据从磁盘读写肯定比在内存中读写要慢很多,而且目前也有很多内存数据已经有非常成熟的实施经验,当然,当今数据库的老大Oracle更加不会无视这个市场,很早就渲染他们Oracle12c的内存组件多么的牛叉,快到不行更是他们经常使用的词汇。
在今年7月22日,Oracle终于发布了12.1.0.2版本,当然最关注的就是这个In-Memory组件的使用了。下载地址:http://www.oracle.com/technetwork/database/enterprise-edition/downloads/database12c-linux-download-2240591.html
在12c的In-Memory Option选件之中,数据在内存的独立区域中按照列式存储,数据是被压缩存放的,内存与列式压缩可以极大提升查询的性能,下图是IMO的示意图:
这是美国时间6月10日,拉里做的12c inmemory option的现场发布会。
------------------------------------------------------------------
版权所有,文章允许转载,但必须以链接方式注明源地址,否则追究法律责任!
建议看到转载,请直接访问正版链接获得最新的ArcGIS技术文章
Blog: http://blog.csdn.net/linghe301
------------------------------------------------------------------
测试环境1:VM虚拟机 、Linux 5.5、4GB内存、Oracle12.1.0.2 数据介绍:非空间数据
1:连接到sys用户下,查看内存初始化参数的值
[oracle@oracle12c ~]$ sqlplus / as sysdba SQL*Plus: Release 12.1.0.2.0 Production on Wed Jul 30 23:25:42 2014 Copyright (c) 1982, 2014, Oracle. All rights reserved. Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options SQL> show parameter inm NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ inmemory_clause_default string inmemory_force string DEFAULT inmemory_max_populate_servers integer 0 inmemory_query string ENABLE inmemory_size big integer 0 inmemory_trickle_repopulate_servers_ integer 1 percent optimizer_inmemory_aware boolean TRUE
2:默认情况下内存参数inmemory_size是没有值得,用户需要手动修改参数值。
SQL> alter system set inmemory_size=2G scope=spfile; System altered. SQL> alter system set inmemory_max_populate_servers=2 scope=spfile; System altered. SQL> shutdown immediate; Database closed. Database dismounted. ORACLE instance shut down. SQL> startup ORACLE instance started. Total System Global Area 4294967296 bytes Fixed Size 2932632 bytes Variable Size 603979880 bytes Database Buffers 1526726656 bytes Redo Buffers 13844480 bytes In-Memory Area 2147483648 bytes Database mounted. Database opened. SQL> show parameter inm NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ inmemory_clause_default string inmemory_force string DEFAULT inmemory_max_populate_servers integer 2 inmemory_query string ENABLE inmemory_size big integer 2G inmemory_trickle_repopulate_servers_ integer 1 percent optimizer_inmemory_aware boolean TRUE
------------------------------------------------------------------
版权所有,文章允许转载,但必须以链接方式注明源地址,否则追究法律责任!
建议看到转载,请直接访问正版链接获得最新的ArcGIS技术文章
Blog: http://blog.csdn.net/linghe301
------------------------------------------------------------------
3:创建一个普通表,然后进行一个普通测试
SQL> create table t1 as select * from dba_objects; Table created. SQL> select bytes/1024/1024 from user_segments where segment_name='T1'; BYTES/1024/1024 --------------- 12.5 SQL> set timing on SQL> set time on 01:53:00 SQL> set autot traceonly 01:53:07 SQL> select * from t1; 92177 rows selected. Elapsed: 00:00:03.51 Execution Plan ---------------------------------------------------------- Plan hash value: 3617692013 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 92177 | 10M| 429 (1)| 00:00:01 | | 1 | TABLE ACCESS FULL| T1 | 92177 | 10M| 429 (1)| 00:00:01 | -------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 2 recursive calls 0 db block gets 7596 consistent gets 1546 physical reads 0 redo size 12280356 bytes sent via SQL*Net to client 68146 bytes received via SQL*Net from client 6147 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 92177 rows processed
我们从执行计划中可以看到,逻辑读7596个、物理读1546个。该表在数据库中占用空间约12.5MB。
4:创建同样的表在内存组件中进行测试
SQL> create table t2 as select * from dba_objects; Table created. SQL> set line 200 SQL> alter table t2 inmemory; Table altered. SQL> select * from v$inmemory_area; POOL ALLOC_BYTES USED_BYTES POPULATE_STATUS CON_ID -------------------------- ----------- ---------- -------------------------- ---------- 1MB POOL 1710227456 4194304 DONE 3 64KB POOL 419430400 51314688 DONE 3 SQL> select count(*) from t2; COUNT(*) ---------- 92178 SQL> select * from v$inmemory_area; POOL ALLOC_BYTES USED_BYTES POPULATE_STATUS CON_ID -------------------------- ----------- ---------- -------------------------- ---------- 1MB POOL 1710227456 8388608 DONE 3 64KB POOL 419430400 51445760 DONE 3 SQL> set timing on SQL> set time on 01:59:12 SQL> set autot traceonly 01:59:21 SQL> select * from t2; 92178 rows selected. Elapsed: 00:00:03.42 Execution Plan ---------------------------------------------------------- Plan hash value: 1513984157 ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 92178 | 10M| 31 (17)| 00:00:01 | | 1 | TABLE ACCESS INMEMORY FULL| T2 | 92178 | 10M| 31 (17)| 00:00:01 | ----------------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 5 recursive calls 0 db block gets 9 consistent gets 0 physical reads 0 redo size 5016356 bytes sent via SQL*Net to client 68146 bytes received via SQL*Net from client 6147 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 92178 rows processed
------------------------------------------------------------------
版权所有,文章允许转载,但必须以链接方式注明源地址,否则追究法律责任!
建议看到转载,请直接访问正版链接获得最新的ArcGIS技术文章
Blog: http://blog.csdn.net/linghe301
------------------------------------------------------------------
5:结论
从上面可以看出,当将表至于inmemory状态时,该数据并没有在内存中,我们可以查看v$inmemory_area表里面信息,我们需要执行一些SQL语句如select count(*) from table将表读入内存中,这时候查看v$inmemory_area表中可以看到增长的信息,可能我原来进行过相关测试,1M内存pool和64K内存pool都有相关的信息,我们可以进行压缩对比来查看行式存储和列式存储的比较。
02:12:21 SQL> select (51445760 +8388608 -51314688 -4194304)/1024/1024 MB from dual; MB ---------- 4.125
可以进行对比,行式存储为12.5MB,列式存储约4MB,如果数据量更大的话这个对比更加明显。
另外我们可以看到,使用in-memory组件,相关的逻辑读仅有5,而且没有物理读,这个也是该组件的高效之处。
但是我们细心的朋友也会发现一个问题,普通查询耗时3.51秒,但是内存组件查询耗时3.42秒,我们看到后者的指标要比前者漂亮的多,但是性能方面并没有想象中的提高,这个是为什么呢?
经过咨询,这个问题可能是:inmemory是列式存储,数据经过压缩的。它的优势是针对某些列的分析型操作。你如果只是把数据拿出来,数据库需要把列数据拼成行数据,相对于普通的行式存储还要干额外的工作,当然要慢了。
PS:因为虚拟机的问题,我们测试都是在同一条件下进行,结果可能有所不同,但是希望能够说明相关的问题。
6:其他
当然,Oracle也提供了In-Memory 的视图来帮助用户进行分析
v$im_segments
SQL> desc v$im_segments Name Null? Type ----------------------------------------- -------- ---------------------------- OWNER VARCHAR2(128) SEGMENT_NAME VARCHAR2(128) PARTITION_NAME VARCHAR2(128) SEGMENT_TYPE VARCHAR2(18) TABLESPACE_NAME VARCHAR2(30) INMEMORY_SIZE NUMBER BYTES NUMBER BYTES_NOT_POPULATED NUMBER POPULATE_STATUS VARCHAR2(9) INMEMORY_PRIORITY VARCHAR2(8) INMEMORY_DISTRIBUTE VARCHAR2(15) INMEMORY_DUPLICATE VARCHAR2(13) INMEMORY_COMPRESSION VARCHAR2(17) CON_ID NUMBER SQL> select inmemory_size/1024/1024,bytes/1024/1024 from v$im_segments where segment_name='T2'; INMEMORY_SIZE/1024/1024 BYTES/1024/1024 ----------------------- --------------- 4.125 12.5
user_tables表也会多了几项关于INMEMORY的相关信息
SQL> desc user_tables Name Null? Type ----------------------------------------- -------- ---------------------------- TABLE_NAME NOT NULL VARCHAR2(128) TABLESPACE_NAME VARCHAR2(30) CLUSTER_NAME VARCHAR2(128) IOT_NAME VARCHAR2(128) STATUS VARCHAR2(8) PCT_FREE NUMBER PCT_USED NUMBER INI_TRANS NUMBER MAX_TRANS NUMBER INITIAL_EXTENT NUMBER NEXT_EXTENT NUMBER MIN_EXTENTS NUMBER MAX_EXTENTS NUMBER PCT_INCREASE NUMBER FREELISTS NUMBER FREELIST_GROUPS NUMBER LOGGING VARCHAR2(3) BACKED_UP VARCHAR2(1) NUM_ROWS NUMBER BLOCKS NUMBER EMPTY_BLOCKS NUMBER AVG_SPACE NUMBER CHAIN_CNT NUMBER AVG_ROW_LEN NUMBER AVG_SPACE_FREELIST_BLOCKS NUMBER NUM_FREELIST_BLOCKS NUMBER DEGREE VARCHAR2(10) INSTANCES VARCHAR2(10) CACHE VARCHAR2(5) TABLE_LOCK VARCHAR2(8) SAMPLE_SIZE NUMBER LAST_ANALYZED DATE PARTITIONED VARCHAR2(3) IOT_TYPE VARCHAR2(12) TEMPORARY VARCHAR2(1) SECONDARY VARCHAR2(1) NESTED VARCHAR2(3) BUFFER_POOL VARCHAR2(7) FLASH_CACHE VARCHAR2(7) CELL_FLASH_CACHE VARCHAR2(7) ROW_MOVEMENT VARCHAR2(8) GLOBAL_STATS VARCHAR2(3) USER_STATS VARCHAR2(3) DURATION VARCHAR2(15) SKIP_CORRUPT VARCHAR2(8) MONITORING VARCHAR2(3) CLUSTER_OWNER VARCHAR2(128) DEPENDENCIES VARCHAR2(8) COMPRESSION VARCHAR2(8) COMPRESS_FOR VARCHAR2(30) DROPPED VARCHAR2(3) READ_ONLY VARCHAR2(3) SEGMENT_CREATED VARCHAR2(3) RESULT_CACHE VARCHAR2(7) CLUSTERING VARCHAR2(3) ACTIVITY_TRACKING VARCHAR2(23) DML_TIMESTAMP VARCHAR2(25) HAS_IDENTITY VARCHAR2(3) CONTAINER_DATA VARCHAR2(3) INMEMORY VARCHAR2(8) INMEMORY_PRIORITY VARCHAR2(8) INMEMORY_DISTRIBUTE VARCHAR2(15) INMEMORY_COMPRESSION VARCHAR2(17) INMEMORY_DUPLICATE VARCHAR2(13)
------------------------------------------------------------------
版权所有,文章允许转载,但必须以链接方式注明源地址,否则追究法律责任!
建议看到转载,请直接访问正版链接获得最新的ArcGIS技术文章
Blog: http://blog.csdn.net/linghe301
------------------------------------------------------------------
测试环境2:IBM 笔记本 W500 、Linux 6.4、8GB内存、Oracle12.1.0.2、ArcSDE10.3 数据介绍:空间数据 ST_Geometry存储,内存参数设置为3GB
1:首先看一下数据情况,面状数据subdltb约300W条记录,查询数据也是面状数据query,里面包含一个大的要素
11:40:28 SQL> select count(*) from subdltb; COUNT(*) ---------- 2999999 Elapsed: 00:00:11.09 11:40:53 SQL> select sde.st_area(shape) from query where objectid=3; SDE.ST_AREA(SHAPE) ------------------ 4.0640E+10 Elapsed: 00:00:00.42
2:使用ArcSDE for Oracle的ST_Intersects函数进行查询,然后进行sum求和
11:41:16 SQL> select sum(a.db2gse_st_) from subdltb a,query b where sde.st_intersects(a.shape,b.shape)=1 and b.objectid=3; SUM(A.DB2GSE_ST_) ----------------- 4451543224 Elapsed: 00:03:01.04 Execution Plan ---------------------------------------------------------- Plan hash value: 2821153078 -------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows |Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 |4648 | 2 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 |4648 | | | | 2 | NESTED LOOPS | | 27712 |122M | 2 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID | QUERY | 1 |2324 | 0 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | R7_SDE_ROWID_UK | 1 | | 0 (0)| 00:00:01 | | 5 | TABLE ACCESS BY INDEX ROWID | SUBDLTB | 27712 |61M | 2 (0)| 00:00:01 | |* 6 | DOMAIN INDEX (Sel: Default - No Stats)| SHAPE_92247_4_SIDX | | | 18E (0)| | -------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("B"."OBJECTID"=3) 6 - access("SDE"."ST_INTERSECTS"("A"."SHAPE","B"."SHAPE")=1) Note ----- - dynamic statistics used: dynamic sampling (level=2) Statistics ---------------------------------------------------------- 733910 recursive calls 0 db block gets 835491 consistent gets 88615 physical reads 0 redo size 559 bytes sent via SQL*Net to client 552 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 677 sorts (memory) 0 sorts (disk) 1 rows processed
3:使用In-MEMORY组件进行测试
将相关要素类进行Inmemory
SQL> select * from v$inmemory_area; POOL ALLOC_BYTES USED_BYTES POPULATE_STATUS CON_ID -------------------------- ----------- ---------- -------------------------- ---------- 1MB POOL 2565865472 4194304 DONE 3 64KB POOL 637534208 51642368 DONE 3 SQL> alter table query inmemory; Table altered. SQL> select count(*) from query; COUNT(*) ---------- 4 SQL> select * from v$inmemory_area; POOL ALLOC_BYTES USED_BYTES POPULATE_STATUS CON_ID -------------------------- ----------- ---------- -------------------------- ---------- 1MB POOL 2565865472 4194304 DONE 3 64KB POOL 637534208 51642368 DONE 3 SQL> select bytes/1024 KB from user_segments where segment_name='QUERY'; KB ---------- 64我们发现,Query要素类并没有加入内存中,Oracle帮助有提示:Objects that are smaller than 64KB are not populated into memory ,Query数据刚好64KB。
然后将subdltb数据放入内存中。
SQL> alter table subdltb inmemory; Table altered. SQL> select count(*) from subdltb; COUNT(*) ---------- 2999999 SQL> select * from v$inmemory_area; POOL ALLOC_BYTES USED_BYTES POPULATE_STATUS CON_ID -------------------------- ----------- ---------- -------------------------- ---------- 1MB POOL 2565865472 849346560 POPULATING 3 64KB POOL 637534208 52297728 POPULATING 3 SQL> / POOL ALLOC_BYTES USED_BYTES POPULATE_STATUS CON_ID -------------------------- ----------- ---------- -------------------------- ---------- 1MB POOL 2565865472 2554331136 DONE 3 64KB POOL 637534208 52953088 DONE 3
我们看到,一开始查看v$inmemory_area的populate_status是populating,这是因为300W记录的数据,需要一定的时间写入内存中,所以需要稍等些时间状态才会变成DONE。然后查看一下v$im_segments表信息
SQL> select INMEMORY_SIZE/1024/1024,bytes/1024/1024 from v$im_segments where segment_name='SUBDLTB'; INMEMORY_SIZE/1024/1 BYTES/1024/1024 -------------------- --------------- 1182.25 1025我们发现,针对于ST_Geometry存储的数据,列式存储压缩之后比行式存储还要大,这个让人很不理解。
进行实际查询
SQL> set timing on SQL> set time on 12:36:09 SQL> set autot on 12:36:16 SQL> select sum(a.db2gse_st_) from subdltb a,query b where sde.st_intersects(a.shape,b.shape)=1 and b.objectid=3; SUM(A.DB2GSE_ST_) ----------------- 4451543224 Elapsed: 00:03:00.14 Execution Plan ---------------------------------------------------------- Plan hash value: 2821153078 ---------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 4648 | 2 (0) | 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 4648 | | | | 2 | NESTED LOOPS | | 27712 | 122M| 2 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID | QUERY | 1 | 2324 | 0 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | R7_SDE_ROWID_UK | 1 | | 0 (0)| 00:00:01 | | 5 | TABLE ACCESS BY INDEX ROWID | SUBDLTB | 27712 | 61M| 2 (0)| 00:00:01 | |* 6 | DOMAIN INDEX (Sel: Default - No Stats)| SHAPE_92247_4_SIDX | | | 18E (0) | | ---------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("B"."OBJECTID"=3) 6 - access("SDE"."ST_INTERSECTS"("A"."SHAPE","B"."SHAPE")=1) Note ----- - dynamic statistics used: dynamic sampling (level=2) Statistics ---------------------------------------------------------- 734531 recursive calls 0 db block gets 835836 consistent gets 87819 physical reads 0 redo size 559 bytes sent via SQL*Net to client 551 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 455 sorts (memory) 0 sorts (disk) 1 rows processed 12:39:25 SQL> select sum(a.db2gse_st_) from subdltb a,query b where sde.st_intersects(b.shape,a.shape)=1 and b.objectid=3; SUM(A.DB2GSE_ST_) ----------------- 4451543224 Elapsed: 00:12:46.23 Execution Plan ---------------------------------------------------------- Plan hash value: 209829830 ------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 4648 | 1851 (29)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 4648 | | | | 2 | NESTED LOOPS | | 27712 | 122M| 1851 (29)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID| QUERY | 1 | 2324 | 0 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | R7_SDE_ROWID_UK | 1 | | 0 (0)| 00:00:01 | |* 5 | TABLE ACCESS INMEMORY FULL | SUBDLTB | 27712 | 61M| 1851 (29)| 00:00:01 | ------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("B"."OBJECTID"=3) 5 - filter("SDE"."ST_INTERSECTS"("B"."SHAPE","A"."SHAPE")=1) Note ----- - dynamic statistics used: dynamic sampling (level=2) Statistics ---------------------------------------------------------- 1801418 recursive calls 0 db block gets 1124 consistent gets 17 physical reads 0 redo size 559 bytes sent via SQL*Net to client 551 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 5 sorts (memory) 0 sorts (disk) 1 rows processed
看到这个结果,我觉得是否是空间索引表如果In memory是否有相关效果,结果发现,空间索引表不支持in memory,会报ora-64358错误
SQL> select index_id from st_geometry_index where table_name='SUBDLTB'; INDEX_ID ---------- 2 SQL> alter table s2_idx$ inmemory; alter table s2_idx$ inmemory * ERROR at line 1: ORA-64358: in-memory column store feature not supported for IOTs
4:结论
通过对比可以看到,虽然在内存中进行查询,大家都知道,st_instersects(a,b)需要传入两个参数,该函数a参数会走空间索引,b参数走全表扫描,所以尽可能将数据量大的放到a参数的位置,我按照最高效的方式测试,发现这种方式与普通查询没有任何差别,不管是逻辑读还是物理读和普通查询区别不大,查询时间也基本类似,如果我更换顺序,走比较低效的查询方式,果然,从执行计划指标上看,在内存中进行全表扫描,而且物理读和逻辑读明显减少,但是执行效率更加低效。
在内存技术方面,甲骨文并没有采用SAP HANA的“全内存”架构,数据会根据不同的“温度”来选择不同的处理方式,包含传统硬盘、闪存和内存三个层级,而不是把全部的数据都放到内存当中。Andy Mendelsohn介绍,在Oracle Database In-memory当中,最活跃或者说最热的数据将放到内存中进行分析,活跃度相对较低的数据会采用闪存(事实上,Oracle数据库是最早拥抱闪存的产品之一,在Exadata上已经大面积使用了闪存存储),而温度最低、最不活跃的数据还是会采用传统磁盘来存储。根据不同需求的数据采取不同的策略,这样做的好处在于,客户不必采购大量的内存设备就可以获得最佳性能提升,降低了总体成本,提升了投资回报率。
目前,Oracle的 IN-MEMORY组件还处于研究阶段,这方面的资料还比较少,该问题还在不断研究中,希望能够得到一些有些的解决方法!
当然Oracle的IN-MEMORY OPTION作为一个刚刚发布的组件还没有经过项目的实践,这不已经可以看到关于它的Bug问题了。
Oracle: That BUG in our In-Memory Option will be fixed in October
http://www.theregister.co.uk/2014/07/31/oracle_in_memory_bug_fix
------------------------------------------------------------------
版权所有,文章允许转载,但必须以链接方式注明源地址,否则追究法律责任!
建议看到转载,请直接访问正版链接获得最新的ArcGIS技术文章
Blog: http://blog.csdn.net/linghe301
欢迎添加微信公众号:ArcGIS技术分享(arcgis_share),直接回复1就可以在移动端获取最新技术文章
------------------------------------------------------------------
参考文档:
Oracle 12c新特性:IN-Memory Option - 列存与压缩:http://www.eygle.com/archives/2014/07/oracle_12c_inmemory_option_two.html
Oracle 12c In-Memory option:http://www.orasql.com/blog/archives/2014/07/23/12c_inmemory.htm
【Oracle Database 12c新特性】In-Memory Option:http://www.askmaclean.com/archives/12c-in-memory-option.html
inmemory option的简单介绍和测试:http://www.oracleblog.org/study-note/in-memory-option-simple-test/