Apache Doris 02 | Problems encountered when importing data

1. Broker Load import fails

load label example_db.stuscore (data infile ("hdfs://devtest4.com:50070/tmp/testdata/stuscore.txt") into table stuscore) with broker 'broker_name' ("username"="root","password"="");

Check the status of the load job.

show load order by createtime desc limit 1\G

ErrorMsg: type:ETL_RUN_FAIL; msg:errCode = 2, detailMessage = Broker list path failed. path=hdfs://devtest4.com:50075/tmp/test1/tabledata1,broker=TNetworkAddress(hostname:192.168.11.37, port:8000),msg=unknown error when get file status, cause by: Call From devtest1.com/192.168.11.37 to devtest4.com:50075 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

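The broker could not list the data path because the connection was refused. First check whether HDFS is really listening on port 50070.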

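# Get the NameNode address and port (fs.default.name is the deprecated name of fs.defaultFS)
hdfs getconf -confKey fs.default.name
# List the files under the root directory
hdfs dfs -ls hdfs://devtest4.com:8020/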
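The port lookup above can be scripted so that the load statement always uses the address HDFS itself reports. The following is only a minimal sketch: the function name namenode_authority is made up for illustration, and it does nothing more than strip the scheme and any path from a URI like the one printed by hdfs getconf.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or Doris):
# print the host:port part of an HDFS URI.
namenode_authority() {
    uri="$1"
    rest="${uri#*://}"           # drop the scheme, e.g. "hdfs://"
    printf '%s\n' "${rest%%/*}"  # drop everything from the first "/" onward
}

# Example with the value seen in this environment:
namenode_authority "hdfs://devtest4.com:8020"
# On a real cluster one might instead run:
#   namenode_authority "$(hdfs getconf -confKey fs.defaultFS)"
```

The printed host:port is what belongs in the DATA INFILE path of the load statement.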

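It really was a port problem: the NameNode RPC address is on port 8020, whereas 50070 is normally the NameNode web UI port in Hadoop 2.x. Run the load again with the correct port: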

load label example_db.stuscore (data infile ("hdfs://devtest4.com:8020/tmp/testdata/stuscore.txt") into table stuscore) with broker 'broker_name' ("username"="root","password"="");

This time the job fails with ErrorMsg: type:ETL_QUALITY_UNSATISFIED; msg:quality not good enough to cancel

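This is a data quality problem. Specify the column separator, the target columns, and so on in the load statement. Use desc stuscore to check the table's column names.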

LOAD LABEL example_db.stuscore01 ( DATA INFILE("hdfs://devtest4.com:8020/tmp/testdata/stuscore.txt") INTO TABLE stuscore COLUMNS TERMINATED BY "," (id,name,score) SET (id=id,name=name,score=score)) WITH BROKER 'broker_name' ("username"="root","password"="") PROPERTIES ("timeout" = "3600");

State: FINISHED. The data was imported successfully.
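A quality failure like the one above often comes from rows whose field count does not match the column list. As a rough pre-check before submitting a load, a local copy of the file can be scanned for such rows. This is a sketch under assumptions: the helper name count_bad_rows and the sample data are invented for illustration, and it checks only the number of fields, not types or value ranges.

```shell
#!/bin/sh
# Hypothetical helper: count lines whose number of fields differs from the expected one.
# usage: count_bad_rows FILE SEPARATOR EXPECTED_FIELDS
count_bad_rows() {
    awk -F"$2" -v n="$3" 'NF != n { bad++ } END { print bad + 0 }' "$1"
}

# Self-contained demo with made-up sample data (not the original stuscore.txt)
sample=$(mktemp)
printf '1,Tom,90\n2,Jerry\n3,Anna,85\n' > "$sample"
count_bad_rows "$sample" "," 3   # the second line has only two fields
rm -f "$sample"
```

A non-zero count suggests that either the separator is wrong or some rows would be filtered during the load.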
 

2. Import fails because of data quality problems

Import with Stream Load and set max_filter_ratio in the request. The default is 0, which means the load tolerates no bad rows at all.

curl --location-trusted -u root  -H "label:bigtable20210617_01" -H "column_separator:\t" -H "max_filter_ratio:0.9" -T bigtable http://devtest1.com:18030/api/example_db/bigtable/_stream_load
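To make the effect of max_filter_ratio concrete, the idea is that a load is rejected once the share of filtered rows exceeds the configured ratio. The sketch below assumes the share is simply filtered rows divided by total rows; Doris's exact accounting may differ, and the function name within_filter_ratio and the row counts are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if filtered/total is within the allowed ratio.
# usage: within_filter_ratio FILTERED_ROWS TOTAL_ROWS MAX_FILTER_RATIO
within_filter_ratio() {
    awk -v f="$1" -v t="$2" -v r="$3" 'BEGIN {
        if (t <= 0) exit 1      # no rows at all: treated as a failure in this sketch
        exit !(f / t <= r)      # exit status 0 when the ratio is within the limit
    }'
}

# Made-up numbers: 50 bad rows out of 1000
if within_filter_ratio 50 1000 0;   then echo "ok at 0";   else echo "rejected at 0";   fi
if within_filter_ratio 50 1000 0.9; then echo "ok at 0.9"; else echo "rejected at 0.9"; fi
```

With the default ratio of 0 a single bad row is enough to fail the load, while 0.9 lets the load succeed even when most rows are filtered, so a value that high is better suited to debugging than to production imports.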
 
