在特殊字符(不可见字符,如ASCII 0、控制字符、非法字符等)上,Oracle非常宽松,基本上可以写入任何数据,包括不符合编码规则的数据。而PostgreSQL则不然,必须符合该字符集的编码,比如在UTF8字符集下,输入的字符编码比如符合UTF8编码。
从报错中提取到十六进制\u0000
byteSize:1,"index":47,"rawData":"\u0000","type":"STRING"
ERROR StdoutPluginCollector - 脏数据:
{"exception":"ERROR: invalid byte sequence for encoding \"UTF8\": 0x00","record":[{"byteSize":1,"index":0,"rawData":"Z","type":"STRING"},{"byteSize":3,"index":1,"rawData":"SPD","type":"STRING"},{"byteSize":15,"index":2,"rawData":"SPD TA210_TA302","type":"STRING"},{"byteSize":16,"index":3,"rawData":"跨期套利-TA210/TA302","type":"STRING"},{"byteSize":4,"index":4,"rawData":"LIST","type":"STRING"},{"byteSize":0,"index":5,"type":"STRING"},{"byteSize":0,"index":6,"type":"STRING"},{"byteSize":1,"index":7,"rawData":"5","type":"DOUBLE"},{"byteSize":0,"index":8,"type":"STRING"},{"byteSize":1,"index":9,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":10,"rawData":"0","type":"DOUBLE"},{"byteSize":2,"index":11,"rawData":"-1","type":"DOUBLE"},{"byteSize":1,"index":12,"rawData":"0","type":"DOUBLE"},{"byteSize":14,"index":13,"rawData":"20221021000000","type":"DOUBLE"},{"byteSize":14,"index":14,"rawData":"20221021000000","type":"DOUBLE"},{"byteSize":1,"index":15,"rawData":"0","type":"DOUBLE"},{"byteSize":14,"index":16,"rawData":"20221021000000","type":"DOUBLE"},{"byteSize":3,"index":17,"rawData":"TRD","type":"STRING"},{"byteSize":1,"index":18,"rawData":"2","type":"DOUBLE"},{"byteSize":4,"index":19,"rawData":"1000","type":"DOUBLE"},{"byteSize":1,"index":20,"rawData":"1","type":"DOUBLE"},{"byteSize":3,"index":21,"rawData":"200","type":"DOUBLE"},{"byteSize":1,"index":22,"rawData":"1","type":"DOUBLE"},{"byteSize":4,"index":23,"rawData":"1034","type":"DOUBLE"},{"byteSize":4,"index":24,"rawData":"-586","type":"DOUBLE"},{"byteSize":1,"index":25,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":26,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":27,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":28,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":29,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":30,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":31,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":32,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":33,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":34,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":35,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":36,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":37,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":38,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":39,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":40,"rawData":"0","type":"DOUBLE"},{"byteSize":0,"index":41,"type":"STRING"},{"byteSize":0,"index":42,"type":"STRING"},{"byteSize":1,"index":43,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":44,"rawData":"1","type":"DOUBLE"},{"byteSize":1,"index":45,"rawData":"5","type":"STRING"},{"byteSize":1,"index":46,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":47,"rawData":"\u0000","type":"STRING"},{"byteSize":0,"index":48,"type":"STRING"},{"byteSize":1,"index":49,"rawData":"0","type":"DOUBLE"},{"byteSize":0,"index":50,"type":"STRING"},{"byteSize":1,"index":51,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":52,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":53,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":54,"rawData":"0","type":"DOUBLE"},{"byteSize":0,"index":55,"type":"STRING"},{"byteSize":0,"index":56,"type":"STRING"},{"byteSize":2,"index":57,"rawData":"-1","type":"STRING"},{"byteSize":1,"index":58,"rawData":"N","type":"STRING"},{"byteSize":1,"index":59,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":60,"rawData":"1","type":"DOUBLE"},{"byteSize":1,"index":61,"rawData":"1","type":"DOUBLE"},{"byteSize":1,"index":62,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":63,"rawData":"N","type":"STRING"},{"byteSize":1,"index":64,"rawData":"0","type":"DOUBLE"},{"byteSize":1,"index":65,"rawData":"0","type":"STRING"},{"byteSize":14,"index":66,"rawData":"20220621220012","type":"DOUBLE"},{"byteSize":1,"index":67,"rawData":"Y","type":"STRING"},{"byteSize":1,"index":68,"rawData":"0","type":"STRING"},{"byteSize":4,"index":69,"rawData":"1000","type":"DOUBLE"},{"byteSize":1,"index":70,"rawData":"1","type":"DOUBLE"},{"byteSize":1,"index":71,"rawData":"N","type":"STRING"},{"byteSize":0,"index":72,"type":"STRING"},{"byteSize":1,"index":73,"rawData":"0","type":"STRING"},{"byteSize":0,"index":74,"type":"STRING"}],"type":"writer"}
2022-11-17 19:06:28.200 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[101]ms
创建一张测试表
create table test(id int, name varchar2(10), comm varchar2(30));
向这张测试表中插入以下数据
insert into test values(1, null, ‘null’); --直接插入null
insert into test values(2, ‘’, ‘empty string’); --插入空字符
insert into test values(3, ’ ', ‘blank space’); --插入空格
insert into test values(4, chr(0), ‘ascii0’); --插入ascii为0的字符
查测试表,可以看到第2列似乎都是空的
查看第2列不为空值的行
加trim函数再查
那么这个第4行第2列到底存的是什么呢,我们可以通过dump函数去查看
可以看到第4行的第2列跟第3行的第2列的空格一样,都是占了一个字节,而且他的ASCII码为0,那么我们可以通过下面的方式将第4行查出来
我们将这张表通过plsqldeveloper工具导出来,看是个什么情况
可以看到通过工具导出来的时候,第4行的第2列会直接转换成null
三、总结
ascii为0插入到oracle中,并不是null,它占用一个字节,要查询出来只能用=chr(0)尽量别插入ascii为0的字符到数据库中,可以用插入空字符串或者null代替,否则检索起来不方便碰到实在是不知道这个字段到底存的是什么的时候,可以使用dump函数去查看它具体的ascii码
PostgreSQL
字符集为UTF8,创建数据表,插入特殊字符(0x00),无法插入
[local:/data/run/pg12]:5120 pg12@testdb=# \encoding
UTF8
[local:/data/run/pg12]:5120 pg12@testdb=# drop table t_0x00;
insert into t_0x00 values(4,'c1'||E'\x00'||'c1',null);
insert into t_0x00 values(5,'c1'||E'\x00'||'c1','\x550055'::bytea);DROP TABLE
[local:/data/run/pg12]:5120 pg12@testdb=# create table t_0x00(id int,c1 varchar(200),c2 bytea);
CREATE TABLE
[local:/data/run/pg12]:5120 pg12@testdb=#
[local:/data/run/pg12]:5120 pg12@testdb=# insert into t_0x00 values(1,E'\x00',null);
ERROR: invalid byte sequence for encoding "UTF8": 0x00
[local:/data/run/pg12]:5120 pg12@testdb=# insert into t_0x00 values(1,chr(0),null);
ERROR: null character not permitted
[local:/data/run/pg12]:5120 pg12@testdb=# insert into t_0x00 values(2,null,bytea '\x00');
INSERT 0 1
[local:/data/run/pg12]:5120 pg12@testdb=# insert into t_0x00 values(3,E'\x00'||'c1',null);
ERROR: invalid byte sequence for encoding "UTF8": 0x00
[local:/data/run/pg12]:5120 pg12@testdb=# insert into t_0x00 values(4,'c1'||E'\x00'||'c1',null);
ERROR: invalid byte sequence for encoding "UTF8": 0x00
[local:/data/run/pg12]:5120 pg12@testdb=# insert into t_0x00 values(5,'c1'||E'\x00'||'c1','\x550055'::bytea);
ERROR: invalid byte sequence for encoding "UTF8": 0x00
[local:/data/run/pg12]:5120 pg12@testdb=# select * from t_0x00 where c1 like '%'||chr(0)||'%';
ERROR: null character not permitted
[local:/data/run/pg12]:5120 pg12@testdb=#