Oracle Character Sets and the Storage Size of Character Types

After reading a new post by Leshami (乐大师) today, I ran the experiment myself.

Leshami's post: Oracle Character Sets and the Storage Size of Character Types

 http://blog.csdn.net/leshami/article/details/51416387 


Connect to the Linux host with the Xmanager Xshell client tool, with its encoding set to UTF-8.

 

Set the language environment on the Linux client: LANG is the operating system locale, and NLS_LANG is the Oracle client environment.

 

Open another session window.

 

 

[root@oraclebak ~]# su - oracle

[oracle@oraclebak ~]$ sqlplus shark/shark

 

SQL*Plus: Release 11.2.0.1.0 Production on 星期二 5月 17 20:29:20 2016

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

 

Check the database character set.

 

SQL> col value format a40

SQL> select * from nls_database_parameters where parameter like '%CHARACT%';

PARAMETER                      VALUE
------------------------------ ----------------------------------------
NLS_NUMERIC_CHARACTERS         .,
NLS_CHARACTERSET               AL32UTF8
NLS_NCHAR_CHARACTERSET         AL16UTF16

 

One Chinese character occupies 3 bytes.

SQL> select dump('鲨') from  dual;

DUMP('鲨')

-------------------------

Typ=96 Len=3: 233,178,168
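
The three byte values in this dump can be cross-checked outside the database. This is a minimal Python sketch, assuming that AL32UTF8 stores this character with the same bytes as standard UTF-8; the variable names are only for illustration.

```python
# Cross-check of DUMP('鲨'): encode the character as UTF-8
# and list the decimal byte values, the same way DUMP prints them.
ch = "鲨"
utf8_bytes = list(ch.encode("utf-8"))

# Print the byte count followed by the individual byte values.
print(len(utf8_bytes), utf8_bytes)
```

The length and byte values printed here should line up with the Len= and byte list in the DUMP output above.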

 

SQL> exit

Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options


[oracle@oraclebak ~]$ env | grep LANG

NLS_LANG=SIMPLIFIED CHINESE_CHINA.AL32UTF8

LANG=zh_CN.UTF-8

[oracle@oraclebak ~]$ unset NLS_LANG

[oracle@oraclebak ~]$ env | grep LANG

LANG=zh_CN.UTF-8

[oracle@oraclebak ~]$

 

SQL> select dump('鲨') from   dual;

 

DUMP('???')

-------------------------------------------------

Typ=96 Len=9:239,191,189,239,191,189,239,191,189

 

Why did it turn into 9 bytes?

 

 

The cause clearly involves NLS_LANG, because this literal was never stored in the database; it was simply sent over and echoed straight back.
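
One way to read the 9 bytes: 239,191,189 is the UTF-8 encoding of U+FFFD, the Unicode replacement character, and it appears three times, once per byte that the terminal sent. The sketch below models that under a loud assumption that the post itself does not confirm: with NLS_LANG unset, the client is treated as 7-bit ASCII, so every byte above 127 is replaced. The function name `convert_from_ascii_client` is hypothetical.

```python
REPLACEMENT = "\ufffd"  # Unicode replacement character


def convert_from_ascii_client(raw: bytes) -> str:
    """Hypothetical model: the client claims 7-bit ASCII, so any byte
    above 127 is not a valid character and becomes U+FFFD."""
    return "".join(chr(b) if b < 128 else REPLACEMENT for b in raw)


# What the UTF-8 terminal actually sends for one Chinese character: 3 bytes.
typed = "鲨".encode("utf-8")

# What the database receives after the (assumed) ASCII-to-UTF-8 conversion.
seen_by_db = convert_from_ascii_client(typed)
dumped = list(seen_by_db.encode("utf-8"))

print(len(dumped), dumped)
```

Under this model each of the three input bytes turns into one three-byte replacement character, which matches the Len=9 result.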

 

OK, let's create a table and store something in it.

SQL> create table tb_length(id int, col1 varchar2(20), col2 nvarchar2(20));

 

Table created.

 

SQL> insert into tb_length values(1,'海鲨','海鲨');

 

1 row created.

 

SQL> commit;

 

Commit complete.

 

SQL> select * from tb_length;

 

        ID COL1                 COL2
---------- -------------------- --------------------
         1 ??????               ??????

 

SQL> select dump(col1),dump(col2) from   tb_length;

 

DUMP(COL1)

--------------------------------------------------------------------------------

DUMP(COL2)

--------------------------------------------------------------------------------

Typ=1 Len=18:239,191,189,239,191,189,239,191,189,239,191,189,239,191,189,239,19

1,189

Typ=1 Len=12:255,253,255,253,255,253,255,253,255,253,255,253

 

 

SQL> exit

Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

After exiting, we restore the language setting.

[oracle@oraclebak ~]$ export NLS_LANG="SIMPLIFIED CHINESE_CHINA.AL32UTF8"

[oracle@oraclebak ~]$ env | grep LANG

NLS_LANG=SIMPLIFIED CHINESE_CHINA.AL32UTF8

LANG=zh_CN.UTF-8

 

Log in again and take another look.

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

With the Partitioning, OLAP, Data Mining and Real Application Testing options

 

SQL> select dump(col1),dump(col2) from   tb_length;

 

DUMP(COL1)

--------------------------------------------------------------------------------

DUMP(COL2)

--------------------------------------------------------------------------------

Typ=1 Len=18:239,191,189,239,191,189,239,191,189,239,191,189,239,191,189,239,19

1,189

Typ=1 Len=12:255,253,255,253,255,253,255,253,255,253,255,253

 

 

SQL> select * from tb_length;

 

        ID COL1

---------- --------------------

COL2

--------------------------------------------------------------------------------

        1 ������

������

 

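Good grief, still garbled! What on earth? The DUMP output is unchanged, so the garbage is what was actually stored, not just a display problem.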


SQL> insert into tb_length values(1,'海鲨','海鲨');

 

1 row created.

 

SQL> commit;

 

Commit complete.

 

SQL> select * from tb_length;

 

        ID COL1                 COL2
---------- -------------------- --------------------
         1 ������               ������
         1 海鲨                 海鲨


SQL> select dump(col1),dump(col2) from   tb_length;

 DUMP(COL1)

--------------------------------------------------------------------------------

DUMP(COL2)

--------------------------------------------------------------------------------

Typ=1 Len=18:239,191,189,239,191,189,239,191,189,239,191,189,239,191,189,239,19

1,189

Typ=1 Len=12:255,253,255,253,255,253,255,253,255,253,255,253

 

Typ=1 Len=6: 230,181,183,233,178,168

Typ=1 Len=4: 109,119,156,168
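
To see what these four dumps actually contain, the byte lists can be decoded back into text. This is a small Python sketch; it assumes AL32UTF8 corresponds to standard UTF-8 and AL16UTF16 to big-endian UTF-16, and `decode_dump` is a hypothetical helper name.

```python
def decode_dump(values, encoding):
    """Turn the decimal byte list printed by DUMP back into text."""
    return bytes(values).decode(encoding)


# First row: inserted while NLS_LANG was unset.
bad_col1 = decode_dump([239, 191, 189] * 6, "utf-8")
bad_col2 = decode_dump([255, 253] * 6, "utf-16-be")

# Second row: inserted after NLS_LANG was restored.
good_col1 = decode_dump([230, 181, 183, 233, 178, 168], "utf-8")
good_col2 = decode_dump([109, 119, 156, 168], "utf-16-be")

print(repr(bad_col1), repr(bad_col2), good_col1, good_col2)
```

The first row decodes to six replacement characters in both columns, so the original text cannot be recovered from it; the second row decodes to the two characters that were typed.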

 

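OK, the conclusion: NLS_LANG is a critical language parameter, and it is set mainly in the client environment.

If it is unset, data goes into the database garbled, even though the Xmanager Xshell tool is set to UTF-8 encoding. That setting only controls what we see displayed on Windows. Even when the Chinese characters are typed correctly, what gets stored is wrong.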

 

SQL> select dump('鲨') from  dual;

DUMP('???')

-------------------------------------------------

Typ=96 Len=9:239,191,189,239,191,189,239,191,189

 

 

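So there are three layers, XSHELL->LINUX->DATABASE, and all three character sets must be set correctly.

On input, the conversion path is XSHELL->LINUX->DATABASE.

On output, the conversion path is DATABASE->LINUX->XSHELL.

If a tool connects to the database directly, the LINUX layer in the middle drops out.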
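
The output path can be sketched as a chain of decode and encode steps. This is only a simplified model in Python, not how Oracle is implemented: the function `display` and its parameters are hypothetical, and Python's "gbk" codec stands in for ZHS16GBK.

```python
def display(stored: bytes, db_charset: str, client_charset: str,
            terminal_charset: str) -> str:
    """Simplified model of DATABASE -> LINUX (NLS_LANG) -> XSHELL."""
    text = stored.decode(db_charset)                        # DATABASE layer
    sent = text.encode(client_charset, errors="replace")    # converted for the client
    return sent.decode(terminal_charset, errors="replace")  # rendered by the terminal


stored = "海鲨".encode("utf-8")

# All three layers agree on UTF-8: the text survives.
ok = display(stored, "utf-8", "utf-8", "utf-8")

# Client declares GBK while the terminal still decodes UTF-8: garbled output.
bad = display(stored, "utf-8", "gbk", "utf-8")

print(ok, repr(bad))
```

The point of the sketch is that a mismatch at any single layer is enough to garble the result, even when the stored bytes are correct.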

 

Now a few words on character set storage.

SQL> select lengthb(col1),lengthb(col2) from tb_length;

 

LENGTHB(COL1) LENGTHB(COL2)
------------- -------------
           18            12
            6             4

 

SQL> select length(col1),length(col2) from tb_length;

 

LENGTH(COL1) LENGTH(COL2)
------------ ------------
           6            6
           2            2

 

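In terms of storage: in a VARCHAR2 column under AL32UTF8 one Chinese character takes 3 bytes, and in an NVARCHAR2 column under AL16UTF16 one Chinese character takes 2 bytes.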
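
The character counts and byte counts above can be reproduced with a short Python sketch. It assumes UTF-8 stands in for AL32UTF8 and big-endian UTF-16 for AL16UTF16; the helper `lengths` and its dictionary keys are hypothetical names that mirror LENGTH and LENGTHB.

```python
def lengths(text: str) -> dict:
    """Character count plus byte counts in the two assumed encodings."""
    return {
        "length": len(text),                               # like LENGTH()
        "lengthb_varchar2": len(text.encode("utf-8")),     # like LENGTHB(col1)
        "lengthb_nvarchar2": len(text.encode("utf-16-be")),  # like LENGTHB(col2)
    }


garbled = lengths("\ufffd" * 6)  # first row: six replacement characters
clean = lengths("海鲨")           # second row: two Chinese characters

print(garbled)
print(clean)
```

Both rows follow the same ratio of 3 bytes per character in the VARCHAR2 column and 2 bytes per character in the NVARCHAR2 column; the garbled row is only longer because it holds six characters instead of two.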

 

Now look at the character set commonly used in China.

PARAMETER                      VALUE
------------------------------ ----------------------------------------
NLS_NUMERIC_CHARACTERS         .,
NLS_CHARACTERSET               ZHS16GBK
NLS_NCHAR_CHARACTERSET         AL16UTF16

 

insert into tb_length values(1,'海鲨','海鲨');

select dump(col1),dump(col2) from  tb_length;

A                              B                              COL1   COL2
------------------------------ ------------------------------ ------ ------
Typ=1 Len=4: 186,163,246,232   Typ=1 Len=4: 109,119,156,168   海鲨   海鲨

 

In the ZHS16GBK character set, one Chinese character takes 2 bytes.
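
The same cross-check works for this database. This is a minimal Python sketch that assumes Python's "gbk" codec produces the same bytes as ZHS16GBK for these two characters, and that big-endian UTF-16 matches AL16UTF16.

```python
text = "海鲨"

# VARCHAR2 column under the (assumed) GBK-equivalent database character set.
gbk_bytes = list(text.encode("gbk"))

# NVARCHAR2 column under the national character set.
utf16_bytes = list(text.encode("utf-16-be"))

print(len(gbk_bytes), gbk_bytes)
print(len(utf16_bytes), utf16_bytes)
```

Both columns come to 4 bytes for two characters here, unlike the AL32UTF8 database, where the VARCHAR2 column needed 6.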

 

 
   
 
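Oracle character sets are a real hassle, so annoying!! To sort it out, I think of it as three groups (layers), with two settings in each.
OS layer
NLS_LANG="SIMPLIFIED CHINESE_CHINA.AL32UTF8"  -- client setting, used when displaying database result sets
LANG=zh_CN.UTF-8                              -- operating system language
Database
NLS_CHARACTERSET          AL32UTF8    -- database character set, used by VARCHAR2
NLS_NCHAR_CHARACTERSET    AL16UTF16   -- national character set, used by NVARCHAR2
Database columns
VARCHAR2()
NVARCHAR2()
In mainland China we usually set the database character set to ZHS16GBK and the national character set to AL16UTF16.
NLS_CHARACTERSET          ZHS16GBK
For the Linux OS, use a Chinese locale. Chinese character sets cover English anyway.
export NLS_LANG="SIMPLIFIED CHINESE_CHINA.ZHS16GBK"
export LANG=zh_CN.UTF-8
With this setup one Chinese character takes 2 bytes, so VARCHAR2(4000), which defaults to byte semantics, can hold 2,000 Chinese characters. How many can NVARCHAR2() hold???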
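
The VARCHAR2(4000) arithmetic can be written out as a quick Python sketch. It assumes byte semantics, a column filled only with Chinese characters of the same width as 鲨, and Python's "gbk" and "utf-8" codecs as stand-ins for ZHS16GBK and AL32UTF8; `max_hanzi` is a hypothetical helper.

```python
def max_hanzi(byte_limit: int, bytes_per_char: int) -> int:
    """How many fixed-width characters fit in a byte-limited column."""
    return byte_limit // bytes_per_char


BYTE_LIMIT = 4000  # VARCHAR2(4000) with byte semantics

gbk_capacity = max_hanzi(BYTE_LIMIT, len("鲨".encode("gbk")))
utf8_capacity = max_hanzi(BYTE_LIMIT, len("鲨".encode("utf-8")))

print(gbk_capacity, utf8_capacity)
```

The comparison shows why the choice of database character set matters for column sizing: the same declared length holds noticeably fewer Chinese characters under a 3-byte encoding.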


Character set references: Leshami's post on character sets and globalization

http://blog.csdn.net/leshami/article/details/6030398

and my own post, Oracle Character Sets

http://blog.csdn.net/zengmuansha/article/details/5661691

Configuring PuTTY in a zh_CN.UTF-8 environment

http://blog.csdn.net/zengmuansha/article/details/7814523

How many bytes does VARCHAR2 take? NLS_LENGTH_SEMANTICS, nls_language

http://blog.csdn.net/zengmuansha/article/details/46373443

NVARCHAR(MAXSIZE)

http://blog.csdn.net/zengmuansha/article/details/12949599

 

About DUMP

Usage of the Oracle dump function

http://blog.csdn.net/liuyuehui110/article/details/44617153

 
