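Trafodion currently supports two LOB types, BLOB and CLOB. BLOB (Binary Large Object) is mainly used to store unstructured data such as images and audio. CLOB (Character Large Object) is mainly used for semi-structured data such as large text and long strings. For details on LOB features and usage, see the LOB guide on the Apache Trafodion website: http://trafodion.apache.org/docs/lob_guide/index.html.
The LOB data itself is stored in a separate HDFS file under the HDFS directory /user/trafodion/lobs, while the corresponding Trafodion table stores a unique identifier (the LOB handle) for each LOB value.
When a table containing a LOB column is created, several dependent objects are created along with it to hold the LOB metadata.
A LOB handle describes a LOB object. Every row of a Trafodion table with a LOB column contains the corresponding handle.
The actual LOB data is stored in an HDFS file (column store). The LOB handle describes the location, offset information, and descriptive information of the LOB object, and can be regarded as the unique identifier of that object.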
1 Create a table with a BLOB column
SQL>cqd traf_blob_as_varchar 'off';
--- SQL operation complete.
SQL>create table t_blob(a int not null, b blob) primary key (a);
--- SQL operation complete.
2 View the table definition and related information
SQL>showddl t_blob;
CREATE TABLE TRAFODION.SEABASE.T_BLOB
(
A INT NO DEFAULT NOT NULL NOT DROPPABLE NOT SERIALIZED
, B BLOB DEFAULT NULL NOT SERIALIZED
, PRIMARY KEY (A ASC)
)
ATTRIBUTES ALIGNED FORMAT NAMESPACE 'TRAF_RSRVD_3'
;
SQL>get tables;
Tables in Schema TRAFODION.SEABASE
==================================
LOBDescChunks__04001609182884681768_0001
LOBDescHandle__04001609182884681768_0001
LOBMD__04001609182884681768
T_BLOB
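The output above shows that creating a table with a LOB column creates three additional standalone tables: one LOB MD table and two LOB Desc tables. In addition, for each LOB column, Trafodion stores the LOB data separately under the HDFS directory /user/trafodion/lobs.
3 View the contents of these tables and the file path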
SQL>select * from LOBMD__04001609182884681768;
LOBNUM STORAGETYPE LOCATION COLUMN_NAME
------ ----------- -------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
1 2 /user/trafodion/lobs/TRAF_1500000 B
SQL>select * from T_BLOB;
--- 0 row(s) selected.
hadoop fs -ls /user/trafodion/lobs/TRAF_1500000
Found 1 items
-rw-r--r-- 3 trafodion trafodion 0 2018-05-14 06:09 /user/trafodion/lobs/TRAF_1500000/LOBP_04001609182884681768_0001
The table LOBMD__04001609182884681768 above records where the LOB data is actually stored, the LOB number, and the corresponding column name.
1 Create a table with a CLOB column
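The object names in the listings above follow a visible pattern: the same 20-digit ID appears in the LOBMD table, in both LOBDesc tables, and in the LOBP file, and the LOBDesc tables and the file also carry a 4-digit suffix. The sketch below rebuilds those names in Python. It assumes the suffix is the zero-padded LOBNUM, which is inferred from this single example and not taken from the Trafodion documentation; the function name `lob_object_names` is hypothetical.

```python
def lob_object_names(table_uid: str, lob_num: int) -> dict:
    """Rebuild the object names seen in the listing above for one LOB column.

    Assumption: the naming pattern is inferred from one example only,
    and the 4-digit suffix is taken to be the zero-padded LOBNUM.
    """
    suffix = f"{lob_num:04d}"  # LOBNUM 1 shows up as "0001" in the output
    return {
        "metadata_table": f"LOBMD__{table_uid}",
        "desc_chunks_table": f"LOBDescChunks__{table_uid}_{suffix}",
        "desc_handle_table": f"LOBDescHandle__{table_uid}_{suffix}",
        "hdfs_data_file": f"LOBP_{table_uid}_{suffix}",
    }


# The ID below is copied from the T_BLOB example above.
names = lob_object_names("04001609182884681768", 1)
for role, name in names.items():
    print(f"{role}: {name}")
```

Running this for T_BLOB's ID reproduces the four names shown in the `get tables` and `hadoop fs -ls` output.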
SQL>cqd traf_clob_as_varchar 'off';
--- SQL operation complete.
SQL>create table t_clob(a int not null, b clob) primary key (a);
--- SQL operation complete.
2 View the table definition and related information
SQL>showddl t_clob;
CREATE TABLE TRAFODION.SEABASE.T_CLOB
(
A INT NO DEFAULT NOT NULL NOT DROPPABLE NOT SERIALIZED
, B CLOB DEFAULT NULL NOT SERIALIZED
, PRIMARY KEY (A ASC)
)
ATTRIBUTES ALIGNED FORMAT NAMESPACE 'TRAF_RSRVD_3'
;
SQL>get tables;
Tables in Schema TRAFODION.SEABASE
==================================
LOBDescChunks__04001609182884681768_0001
LOBDescChunks__08455106264966547072_0001
LOBDescHandle__04001609182884681768_0001
LOBDescHandle__08455106264966547072_0001
LOBMD__04001609182884681768
LOBMD__08455106264966547072
T_BLOB
T_CLOB
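As the output above shows, creating T_CLOB with its single LOB column added three more standalone tables to the current schema, which describe that LOB column.
3 View the contents of these tables and the file path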
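To check that count against a `get tables` listing, one option is to group the helper tables by the numeric ID embedded in their names. This is a minimal Python sketch. The regular expression covers only the three name prefixes seen in this article, and `group_lob_tables` is a hypothetical helper, not part of Trafodion.

```python
import re
from collections import defaultdict

# Matches only the three helper-table prefixes that appear in the listing above.
LOB_TABLE = re.compile(r"^(LOBMD|LOBDescChunks|LOBDescHandle)__(\d+)(?:_(\d+))?$")


def group_lob_tables(table_names):
    """Group LOB helper tables by the numeric ID in their names."""
    groups = defaultdict(list)
    for raw in table_names:
        name = raw.strip()
        match = LOB_TABLE.match(name)
        if match:  # user tables such as T_BLOB do not match and are skipped
            groups[match.group(2)].append(name)
    return dict(groups)


# Table names copied from the `get tables` output above.
listing = """
LOBDescChunks__04001609182884681768_0001
LOBDescChunks__08455106264966547072_0001
LOBDescHandle__04001609182884681768_0001
LOBDescHandle__08455106264966547072_0001
LOBMD__04001609182884681768
LOBMD__08455106264966547072
T_BLOB
T_CLOB
"""

for uid, tables in group_lob_tables(listing.split()).items():
    print(uid, len(tables), tables)
```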
SQL>select * from LOBMD__08455106264966547072;
LOBNUM STORAGETYPE LOCATION COLUMN_NAME
------ ----------- -------------------------------------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
1 2 /user/trafodion/lobs/TRAF_1500000 B
SQL>select * from t_clob;
--- 0 row(s) selected.
hadoop fs -ls /user/trafodion/lobs/TRAF_1500000
Found 2 items
-rw-r--r-- 3 trafodion trafodion 0 2018-05-14 06:09 /user/trafodion/lobs/TRAF_1500000/LOBP_04001609182884681768_0001
-rw-r--r-- 3 trafodion trafodion 0 2018-05-14 06:36 /user/trafodion/lobs/TRAF_1500000/LOBP_08455106264966547072_0001
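In the listing above, the fifth column of each `hadoop fs -ls` line is the file size in bytes. Both LOBP files are 0 bytes at this point, which matches the empty tables. The Python sketch below pulls the path and size out of such lines. It assumes the eight-column layout shown above, and `parse_ls_line` is a hypothetical helper.

```python
def parse_ls_line(line):
    """Parse one `hadoop fs -ls` entry; return None for non-entry lines."""
    parts = line.split(None, 7)
    # Entry lines have 8 fields and start with a permission string.
    if len(parts) != 8 or not parts[0].startswith(("-", "d")):
        return None
    perms, replication, owner, group, size, date, time, path = parts
    return {"path": path, "size": int(size), "owner": owner}


# Output copied from the listing above.
ls_output = """Found 2 items
-rw-r--r-- 3 trafodion trafodion 0 2018-05-14 06:09 /user/trafodion/lobs/TRAF_1500000/LOBP_04001609182884681768_0001
-rw-r--r-- 3 trafodion trafodion 0 2018-05-14 06:36 /user/trafodion/lobs/TRAF_1500000/LOBP_08455106264966547072_0001
"""

entries = [e for e in map(parse_ls_line, ls_output.splitlines()) if e]
for entry in entries:
    print(entry["path"], entry["size"])
```

Re-running the same check after the inserts later in this article is one way to see whether a LOBP file has grown.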
1 Insert a NULL value
SQL>insert into t_blob values(1,null);
--- 1 row(s) inserted.
SQL>select * from t_blob;
A B
----------- --------------------------------------------------------------------------------------------------------------------------------
1 NULL
--- 1 row(s) selected.
2 Insert empty_blob(), which returns an empty LOB handle
SQL>insert into t_blob values(2,empty_blob());
--- 1 row(s) inserted.
SQL>select * from t_blob;
A B
----------- --------------------------------------------------------------------------------------------------------------------------------
1 NULL
2 LOBH0000000200010400160918288468176819590156528453273693918212393055256514019021"TRAFODION"."SEABASE"
--- 2 row(s) selected.
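The handle printed above is an opaque string, but two things are visible by eye: it starts with LOBH followed by a run of digits, and it ends with the quoted catalog and schema names. The digit run also contains the 20-digit ID used in the LOBMD table name. The Python sketch below checks exactly that and nothing more. The internal layout of the digits is not documented in this article, so no field boundaries are assumed, and both function names are hypothetical.

```python
import re

# Shape observed in the output above: LOBH + digits + "CATALOG"."SCHEMA"
HANDLE = re.compile(r'^LOBH(\d+)"([^"]+)"\."([^"]+)"$')


def inspect_handle(handle):
    """Split a printed LOB handle into its digit run, catalog, and schema."""
    match = HANDLE.match(handle.strip())
    if match is None:
        return None
    digits, catalog, schema = match.groups()
    return {"digits": digits, "catalog": catalog, "schema": schema}


def handle_contains_uid(handle, table_uid):
    """True if the table ID appears somewhere in the handle's digit run."""
    info = inspect_handle(handle)
    return info is not None and table_uid in info["digits"]


# Handle copied from row 2 of T_BLOB above.
handle = ('LOBH00000002000104001609182884681768195901565284532736939'
          '18212393055256514019021"TRAFODION"."SEABASE"')
info = inspect_handle(handle)
print(info["catalog"], info["schema"])
print(handle_contains_uid(handle, "04001609182884681768"))
```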
3 Insert a local image (note: the file must be uploaded to the same path on every database node)
SQL>insert into t_blob values(3, filetolob('/opt/trafodion/a.png'));
--- 1 row(s) inserted.
SQL>select * from t_blob;
A B
----------- --------------------------------------------------------------------------------------------------------------------------------
1 NULL
2 LOBH0000000200010400160918288468176819590156528453273693918212393055256514019021"TRAFODION"."SEABASE"
3 LOBH0000000200010400160918288468176819400160919305233014918212393056573095236021"TRAFODION"."SEABASE"
--- 3 row(s) selected.
4 Insert an image from HDFS
SQL>insert into t_blob values(4, filetolob('hdfs:///tmp/a.png'));
--- 1 row(s) inserted.
SQL>select * from t_blob;
A B
----------- --------------------------------------------------------------------------------------------------------------------------------
1 NULL
2 LOBH0000000200010400160918288468176819590156528453273693918212393055256514019021"TRAFODION"."SEABASE"
3 LOBH0000000200010400160918288468176819400160919305233014918212393056573095236021"TRAFODION"."SEABASE"
4 LOBH0000000200010400160918288468176819845510627479221136218212393056796983880021"TRAFODION"."SEABASE"
--- 4 row(s) selected.
1 Insert a string
SQL>insert into t_clob values(1,stringtolob('ABCDEFG'));
--- 1 row(s) inserted.
SQL>select * from t_clob;
A B
----------- --------------------------------------------------------------------------------------------------------------------------------
1 LOBH0000000200010845510626496654707219174418018914969589618212393365889784172021"TRAFODION"."SEABASE"
--- 1 row(s) selected.
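The insert statements above differ only in the function that wraps the value: filetolob for a local or hdfs:// path, stringtolob for inline text. When such statements are generated from a script, a small helper keeps the quoting consistent. This Python sketch is an illustration only. `lob_insert_sql` and `sql_literal` are hypothetical names, the table name is inserted without any escaping, and the result is a string to paste into the SQL prompt, not a database call.

```python
def sql_literal(text):
    """Quote a string as an SQL literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def lob_insert_sql(table, key, *, file_path=None, text=None):
    """Build an insert statement in the style used in this article.

    Exactly one of file_path (local or hdfs:// path) or text must be given.
    Assumption: `table` is a trusted name; it is not escaped here.
    """
    if (file_path is None) == (text is None):
        raise ValueError("pass exactly one of file_path or text")
    if file_path is not None:
        value = f"filetolob({sql_literal(file_path)})"
    else:
        value = f"stringtolob({sql_literal(text)})"
    return f"insert into {table} values({int(key)}, {value});"


print(lob_insert_sql("t_blob", 4, file_path="hdfs:///tmp/a.png"))
print(lob_insert_sql("t_clob", 1, text="ABCDEFG"))
```

Note that a local path still has to exist on every database node, as stated for the local image example above; the helper does not check this.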
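The above covers the basics of LOBs in Trafodion: the core concepts, how to create LOB columns, and how to insert data into them. A follow-up will cover how to query and retrieve the data stored in LOB columns.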