这篇文章介绍的是ORACLE CTXCAT索引的INDEX SET,也是CTXCAT索引特有的属性。
CTXCAT 索引是CONTEXT索引的简化版,CTXCAT索引支持的PREFERENCE包括:LEXER、STOPLIST、WORDLIST和STORAGE 参数。不支持其他的参数如:DATASTORE、FILTER、SECTION GROUP。虽然支持LEXER但不支持THEME查询,而且不支持FORMAT、CHARSET和LANGUAGE列,另外不支持表和索引分区。
CTXCAT 索引仅仅包含了CONTEXT索引的部分内容,但是CTXCAT索引有其自身的优点。其中最突出的优点就是支持DML同步。CONTEXT索引由于结构过 于复杂,且索引的数据量一般较大,因此CONTEXT索引并不是自动同步的。而CTXCAT索引是自动同步的,当发生了DML修改时,Oracle会自动 同步CTXCAT索引,降低了索引的维护成本。
CTXCAT 索引的另外一个优点就是这里要介绍的INDEX SET属性,这也是CTXCAT索引特有的属性。简单的说,CTXCAT可以建立一个索引集。可以把一些经常与CTXCAT查询组合使用的查询列的索引添 加到索引集中。比如,如果在查询文章内容的同时,经常需要查询文章的作者、标题或创建时间等信息,则可以将这些信息列的索引添加到索引集中,Oracle 可以将这些查询封装到CATSEARCH操作中,从而提高全文索引的效率。
下面看一个简单的例子:
[oracle@rhel140 ~]$ sqlplus myuser/myuser
SQL*Plus: Release 10.2.0.1.0 - Production on Sun May 24 23:19:47 2009
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
SQL> CREATE TABLE T (ID NUMBER, TITLE VARCHAR2(256), CREATED DATE, DOCS VARCHAR2(4000));
Table created.
SQL> INSERT INTO T VALUES (1, 'ORACLE TEXT REFERENCE', TO_DATE('200203', 'YYYYMM'),
2 'This manual provides reference information for Oracle Text. Use it as a reference for
3 creating Oracle Text indexes, for issuing Oracle Text queries, for presenting
4 documents, and for using the Oracle Text PL/SQL packages.');
1 row created.
SQL> INSERT INTO T VALUES (2, 'ORACLE TEXT APPLICATION DEVELOPER''S GUIDE',
2 TO_DATE('200203', 'YYYYMM'),
3 'This guide explains how to build query applications with Oracle Text. This preface
4 contains these topics:');
1 row created.
SQL> INSERT INTO T VALUES (3, 'ORACLE SQL REFERENCE', TO_DATE('200210', 'YYYYMM'),
2 'This reference contains a complete description of the Structured Query Language
3 (SQL) used to manage information in an Oracle database. Oracle SQL is a superset
4 of the American National Standards Institute (ANSI) and the International
5 Standards Organization (ISO) SQL99 standard.');
1 row created.
SQL> COMMIT;
Commit complete.
SQL> CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT;
Index created.
SQL> SET AUTOT ON EXP
SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', NULL) > 0;
ID
----------
1
3
Execution Plan
----------------------------------------------------------
Plan hash value: 3282600506
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2027 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 2027 | 2 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | IND_T_DOCS | | | | |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("CTXSYS"."CATSEARCH"("DOCS",'SQL',NULL)>0)
Note
-----
- dynamic sampling used for this statement
SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', NULL) > 0
2 AND CREATED = TO_DATE('200203', 'YYYYMM');
ID
----------
1
Execution Plan
----------------------------------------------------------
Plan hash value: 3282600506
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2036 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 2036 | 2 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | IND_T_DOCS | | | | |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("CREATED"=TO_DATE('2002-03-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
2 - access("CTXSYS"."CATSEARCH"("DOCS",'SQL',NULL)>0)
Note
-----
- dynamic sampling used for this statement
SQL> CREATE INDEX IND_T_CREATED ON T (CREATED);
Index created.
SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', NULL) > 0
2 AND CREATED = TO_DATE('200203', 'YYYYMM');
ID
----------
1
Execution Plan
----------------------------------------------------------
Plan hash value: 2040116302
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2036 | 4 (25)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID | T | 1 | 2036 | 4 (25)| 00:00:01 |
| 2 | BITMAP CONVERSION TO ROWIDS | | | | | |
| 3 | BITMAP AND | | | | | |
| 4 | BITMAP CONVERSION FROM ROWIDS| | | | | |
|* 5 | INDEX RANGE SCAN | IND_T_CREATED | 2 | | 1 (0)| 00:00:01 |
| 6 | BITMAP CONVERSION FROM ROWIDS| | | | | |
| 7 | SORT ORDER BY | | | | | |
|* 8 | DOMAIN INDEX | IND_T_DOCS | 2 | | | |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("CREATED"=TO_DATE('2002-03-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss'))
8 - access("CTXSYS"."CATSEARCH"("DOCS",'SQL',NULL)>0)
Note
-----
- dynamic sampling used for this statement
上面的查询不仅包括了对DOCS列的全文索引查询还包括了对CREATED列的查询。Oracle采用了先将BTREE索引转化为BITMAP索引,在进行BITMAP索引的AND的方法进行了处理。而如果采用了INDEX SET的方法,则不需要如此复杂的转化:
SQL> DROP INDEX IND_T_DOCS;
Index dropped.
SQL> DROP INDEX IND_T_CREATED;
Index dropped.
SQL> BEGIN
2 CTX_DDL.CREATE_INDEX_SET('TEST_INDEXSET');
3 CTX_DDL.ADD_INDEX('TEST_INDEXSET', 'CREATED');
4 END;
5 /
PL/SQL procedure successfully completed.
SQL> CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT
2 PARAMETERS ('INDEX SET TEST_INDEXSET');
Index created.
SQL> SET AUTOT ON EXP
SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', 'CREATED = TO_DATE(''200203'', ''YYYYMM'')') > 0;
ID
----------
1
Execution Plan
----------------------------------------------------------
Plan hash value: 3282600506
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2027 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 2027 | 2 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | IND_T_DOCS | | | | |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("CTXSYS"."CATSEARCH"("DOCS",'SQL','CREATED = TO_DATE(''200203'',
''YYYYMM'')')>0)
Note
-----
- dynamic sampling used for this statement
处理支持结构性查询条件外,使用INDEX SET索引集还可以支持索引列的排序操作
SQL> SET AUTOT OFF
SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', NULL) > 0;
ID
----------
1
3
SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', 'ORDER BY CREATED DESC') > 0;
ID
----------
3
1
这种将索引添加到索引集的操作也有一定的限制,比如结构性查询的操作只能包含大于、等于、小于、IN、BETWEEN AND操作。另外,加入到索引集中索引列的大小也有限制,对于VARCHAR2和CHAR类型,大小不能超过30字符,否则会报错:
用了不等于操作
SQL> SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', 'CREATED != TO_DATE(''200203'', ''YYYYMM'')') >0;
SELECT ID FROM T WHERE CATSEARCH(DOCS, 'SQL', 'CREATED != TO_DATE(''200203'', ''YYYYMM'')') >0
*
ERROR at line 1:
ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-10844: index cannot execute this structured predicate efficiently
DRG-10845: column CREATED ! is not indexed
建立超过30个字符的索引
SQL> DROP INDEX IND_T_DOCS;
Index dropped.
SQL> exec CTX_DDL.ADD_INDEX('TEST_INDEXSET', 'TITLE');
PL/SQL procedure successfully completed.
SQL> CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT
2 PARAMETERS ('INDEX SET TEST_INDEXSET');
CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT
*
ERROR at line 1:
ORA-29855: error occurred in the execution of ODCIINDEXCREATE routine
ORA-20000: Oracle Text error:
DRG-12306: column TITLE is too long for index set columns
ORA-06512: at "CTXSYS.DRUE", line 160
ORA-06512: at "CTXSYS.CATINDEXMETHODS", line 97
SQL> select id,length(title) from t;
ID LENGTH(TITLE)
---------- -------------
1 21
2 41
3 20
SQL> update t set title=substr(title,1,30) where id=2;
update t set title=substr(title,1,30) where id=2
*
ERROR at line 1:
ORA-29861: domain index is marked LOADING/FAILED/UNUSABLE
SQL> drop index IND_T_DOCS;
Index dropped.
SQL> update t set title=substr(title,1,30) where id=2;
1 row updated.
SQL> commit;
Commit complete.
SQL> ALTER TABLE MYUSER.T MODIFY(TITLE VARCHAR2(30 BYTE));
Table altered.
SQL> CREATE INDEX IND_T_DOCS ON T (DOCS) INDEXTYPE IS CTXSYS.CTXCAT
2 PARAMETERS ('INDEX SET TEST_INDEXSET');
Index created.
将字段长度改为varchar2(30)则可以建立