Notes on problems encountered while using Apache Doris

1, Error while executing a CREATE TABLE statement:

[Err] 1064 - errCode = 2, detailMessage = Failed to find enough host in all backends. need: 3

Cause:

The statement specified PROPERTIES("replication_num" = "3");

but only 2 BEs were available:

[Image 1]
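Before creating the table it can help to confirm how many BEs are actually alive. The sketch below is only illustrative: `example_db.example_tbl` and its columns are hypothetical names, and lowering `replication_num` is a stopgap for test tables until the missing BE is back, not a recommendation.

```sql
-- List the BE nodes; only those with Alive = true can host replicas.
SHOW BACKENDS;

-- Hypothetical table. replication_num must not exceed the number of alive BEs,
-- so with 2 alive BEs a value of 3 fails with "Failed to find enough host".
CREATE TABLE example_db.example_tbl (
    user_id   BIGINT,
    user_name VARCHAR(64)
)
DUPLICATE KEY(user_id)
DISTRIBUTED BY HASH(user_id) BUCKETS 8
PROPERTIES ("replication_num" = "2");
```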

Check the log on the affected node:

==> ./be.WARNING.log.20200921-141304 <==
W1026 18:13:39.139992 19091 utils.cpp:101] fail to get master client from cache. host=192.168.6.143, port=9020, code=7
W1026 18:13:39.140386 19091 task_worker_pool.cpp:1185] finish report olap table state failed. status:-1, master host:192.168.6.143, port:9020
W1026 18:13:40.391201 19089 utils.cpp:101] fail to get master client from cache. host=192.168.6.143, port=9020, code=7
W1026 18:13:40.391471 19089 task_worker_pool.cpp:1060] finish report task failed. status:-1, master host:192.168.6.143port:9020
W1027 10:00:31.385262  2359 data_dir.cpp:128] open file filed, error: IO error: failed to open cluster id file /wyyt/software/doris/be/storage/cluster_id
W1027 10:00:31.385926  2359 data_dir.cpp:95] _init_cluster_id failed, error: IO error: failed to open cluster id file /wyyt/software/doris/be/storage/cluster_id
W1027 10:00:31.385958  2359 storage_engine.cpp:192] Store load failed, status=IO error: failed to open cluster id file /wyyt/software/doris/be/storage/cluster_id, path=/wyyt/software/doris/be/storage
W1027 10:00:31.386071  2353 storage_engine.cpp:148] _init_store_map failed, error: Internal error: init path failed, error=IO error: failed to open cluster id file /wyyt/software/doris/be/storage/cluster_id;
W1027 10:00:31.386106  2353 storage_engine.cpp:96] open engine failed, error: Internal error: init path failed, error=IO error: failed to open cluster id file /wyyt/software/doris/be/storage/cluster_id;
F1027 10:00:31.386186  2353 doris_main.cpp:189] fail to open StorageEngine, res=init path failed, error=IO error: failed to open cluster id file /wyyt/software/doris/be/storage/cluster_id;
 

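Once the cause is identified, fix it. In my case the BE failed to open the cluster_id file, so I tried setting the permissions to 755 and then restarted the BE node.

If the restart fails, delete be.pid and restart again.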

2, The owner of the log files changed

[Image 2]

The log files are owned by whichever user started the service.

3, Error when creating a Doris table

[Image 3]

Cause: the declared column lengths must not add up to more than 100,000. The limit can be changed through configuration, but that is not recommended.

[Image 4]

4, Disk full

ErrorReason{code=errCode = 2, msg='failed to create task: errCode = 2, detailMessage = disk 6189104187500640169 on backend 11001 exceed limit usage'

As a result, all tasks were paused.

[Image 5]
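A rough way to locate the full disk from a MySQL client is sketched below. The backend id 11001 is taken from the error message above. Which threshold actually triggered the error is an assumption on my part, so the last statement simply lists every FE setting whose name contains `usage_percent`.

```sql
-- Cluster-wide view: compare AvailCapacity and MaxDiskUsedPct across BEs.
SHOW BACKENDS;

-- Per-disk detail for the backend named in the error message.
-- The PathHash column should match the disk id reported in the error.
SHOW PROC '/backends/11001';

-- FE-side disk usage thresholds (pattern match; exact names vary by version).
ADMIN SHOW FRONTEND CONFIG LIKE '%usage_percent%';
```

Once space has been freed on the affected disk, paused jobs still have to be resumed by hand.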

5, Enabling materialized views

create materialized view test_p_user_view as select user_id,user_name from test_p_user limit 8;
ERROR 1064 (HY000): errCode = 2, detailMessage = The materialized view is coming soon

Fix: run this command on the master FE: ADMIN SET FRONTEND CONFIG ("enable_materialized_view" = "true");

Materialized views currently support only duplicate key tables. Version 0.12 supports only part of the feature; 0.13 is expected to complete it.
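After changing the setting, the sequence below is one way to verify it and retry. Dropping the LIMIT clause is my own assumption: as far as I know, materialized view definitions do not accept LIMIT, so the retry uses the same columns without it.

```sql
-- Confirm the FE setting took effect (run against the master FE).
ADMIN SHOW FRONTEND CONFIG LIKE 'enable_materialized_view';

-- Same view as in the failing statement above, minus the LIMIT clause.
CREATE MATERIALIZED VIEW test_p_user_view AS
SELECT user_id, user_name FROM test_p_user;

-- List the base table together with its materialized views.
DESC test_p_user ALL;
```

The view is built asynchronously, so it may take a while to appear in the DESC output.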

6, Workflow for loading Hive data into Doris

1, Create the corresponding table in Doris

[Image 6]

2, Run the load statement

[Image 7]
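The original statements were in screenshots that did not survive, so the following is only a sketch of what the two steps typically look like with broker load. Every name here (database, table, label, HDFS path, broker name, credentials) is a placeholder, and the `\\x01` delimiter assumes the Hive table is stored as plain text with Hive's default field separator.

```sql
-- Step 1: target table in Doris (hypothetical schema mirroring the Hive table).
CREATE TABLE example_db.hive_user (
    user_id   BIGINT,
    user_name VARCHAR(64)
)
DUPLICATE KEY(user_id)
DISTRIBUTED BY HASH(user_id) BUCKETS 8;

-- Step 2: broker load from the Hive table's files on HDFS.
LOAD LABEL example_db.hive_user_20210101
(
    DATA INFILE("hdfs://namenode_host:8020/user/hive/warehouse/example.db/hive_user/*")
    INTO TABLE hive_user
    COLUMNS TERMINATED BY "\\x01"
    (user_id, user_name)
)
WITH BROKER "broker_name"
(
    "username" = "hdfs_user",
    "password" = ""
);

-- Broker load is asynchronous; poll until State is FINISHED or CANCELLED.
SHOW LOAD FROM example_db WHERE LABEL = "hive_user_20210101";
```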

7, type:LOAD_RUN_FAIL; msg:errCode = 2, detailMessage = there is no scanNode Backend

Loading a large table from HDFS crashed a BE node.

Fix: adjust the FE parameters

[Image 8]

The load job should also specify its memory limit explicitly:

[Image 9]
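As a sketch of what setting the memory limit explicitly can look like: the 8 GB value, the timeout, and all names and paths below are illustrative assumptions, not the settings from the lost screenshot.

```sql
-- Queries and INSERT INTO ... SELECT in the current session: 8 GB, in bytes.
SET exec_mem_limit = 8589934592;

-- Broker load: the limit is passed per job in the PROPERTIES clause.
LOAD LABEL example_db.big_table_20210101
(
    DATA INFILE("hdfs://namenode_host:8020/path/to/big_table/*")
    INTO TABLE big_table
    COLUMNS TERMINATED BY ","
)
WITH BROKER "broker_name" ("username" = "hdfs_user", "password" = "")
PROPERTIES
(
    "exec_mem_limit" = "8589934592",
    "timeout" = "14400"
);
```

A limit set higher than the BE's physical memory can still end in an OOM, so the value should be sized against the machine.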

Check the BE log and the core file to see whether the crash was an OOM.

References: https://blog.csdn.net/weixin_42135997/article/details/80732658

https://blog.csdn.net/qq_15437667/article/details/83934113?utm_medium=distribute.pc_aggpage_search_result.none-task-blog-2~all~sobaiduend~default-1-83934113.nonecase&utm_term=linux%20%E6%80%8E%E4%B9%88%E7%9C%8Bcore%E6%96%87%E4%BB%B6&spm=1000.2123.3001.4430

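8, Commands suddenly stopped executing

The BE nodes all showed as Alive.

[Image 10]

Nothing stood out in the BE logs (be.INFO and be.WARNING).

It turned out that the disk on one node had failed. Next time this happens, the disks are the first thing to check.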

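9, Rules observed when loading HDFS data with broker load

1) I tested broker load from HDFS into a table that uses the unique key model. Rows with the same primary key did not overwrite each other in input order. Instead, the surviving row was chosen by the length of the second column: the row with the longest value won, and among values of equal length the most recent one won. When the second column was identical, the third column was compared in the same way.

[Image 11]

Resulting data:

[Image 12]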

10, Doris broker load failed

type:LOAD_RUN_FAIL; msg:errCode = 2, detailMessage = all partitions have no load data

The source table was empty, so there was no data to load.

11, Running several broker load jobs at once crashed a BE node

Cause: most likely the BE ran out of memory.

Fix: limit each broker load to 1 GB per node, or less.

12, routine load error: errCode = 2, detailMessage = failed to send task: errCode = 2, detailMessage = failed

By default, the total task concurrency is max_routine_load_task_num_per * number of BEs.

For example, with 3 BE nodes the total concurrency is 5 * 3 = 15.

[Image 13]
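To check the numbers on a given cluster, something like the following should work. The LIKE pattern is used on purpose so the example does not depend on the exact setting name in your version.

```sql
-- Per-BE routine load task limit configured on the FE.
ADMIN SHOW FRONTEND CONFIG LIKE '%max_routine_load_task_num_per%';

-- Count the alive BEs; total concurrency = per-BE limit * number of alive BEs.
SHOW BACKENDS;

-- Running routine load jobs in the current database and their task counts.
SHOW ROUTINE LOAD;
```

If the sum of tasks across all jobs is close to the computed total, new tasks have nowhere to run, which is consistent with the "failed to send task" error.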

13, Loading via INSERT INTO

[Image 14]

14, Load job failed

Not enough memory; raise the memory limit.

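15, ETL_QUALITY_UNSATISFIED; msg:quality not good enough to cancel

Meaning: the data quality is poor. Doris could not parse the data, or parsing failed, so the load job was cancelled.

Possible causes:

1. A VARCHAR value is too long, or there is a delimiter problem

2. too_many_filtered_rows

Fix:

Do not load long text, or truncate it on load; check for data that contains the delimiter.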
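To see which rows were rejected, the load job's record is the place to start. The database name and label below are placeholders.

```sql
-- Look up one job by label; the URL column links to a sample of rejected rows.
SHOW LOAD FROM example_db WHERE LABEL = "example_label";

-- Or list the most recent cancelled jobs in the database.
SHOW LOAD FROM example_db WHERE STATE = "CANCELLED" ORDER BY CreateTime DESC LIMIT 5;
```

If a small share of bad rows is acceptable, the load statement's PROPERTIES clause accepts "max_filter_ratio" (for example "0.1" to tolerate up to 10% filtered rows); the right value depends on how clean the data has to be.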

16, Memory was not released after loading data into Doris with broker load

Fix:

Try upgrading Doris to 0.13.15 and check whether the problem persists:

Link: https://cloud.baidu.com/doc/PALO/s/Ikivhcwb5

[Image 15]

17, Errors encountered

Doris version: 0.13.11 (patch release).

[Image 16]

[Image 17]

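18, The data directory on some BE nodes grew very large while on other BE nodes it stayed normal

Initial diagnosis: the cluster load is unhealthy because routine load is writing too frequently.

Check whether the table is healthy:

Change the routine load parameters, setting max_batch_interval to 60s: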

(
 'desired_concurrent_number'='3',
 'max_batch_interval' = '60',
 'max_batch_rows' = '300000',
 'max_batch_size' = '209715200',
 'strict_mode' = 'false',
 'format' = 'json'
)
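The original does not show the statements used for the table check or for changing the job, so the following is a sketch under assumptions: `example_db.example_tbl` and `example_db.example_job` are hypothetical names, and ALTER ROUTINE LOAD may not be available in older releases (in that case the job has to be stopped and recreated with the new properties).

```sql
-- Replicas of the table that are not in a healthy state.
ADMIN SHOW REPLICA STATUS FROM example_db.example_tbl WHERE STATUS != "OK";

-- How the table's replicas are spread across the BEs.
ADMIN SHOW REPLICA DISTRIBUTION FROM example_db.example_tbl;

-- A job must be paused before its properties can be altered.
PAUSE ROUTINE LOAD FOR example_db.example_job;
ALTER ROUTINE LOAD FOR example_db.example_job
PROPERTIES ("max_batch_interval" = "60");
RESUME ROUTINE LOAD FOR example_db.example_job;
```

A longer batch interval means fewer, larger commits, which reduces the number of small data versions each BE has to compact.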

19, Upgrading to Doris 0.14.7 resolved the earlier "Too Many Tasks" problem

[Image 18]

20, With Doris 0.14.7 and 3 FEs deployed on an internal network, an FE node went down after data was written. The log:

2021-08-27 09:09:25,172 ERROR (heartbeat mgr|19) [BDBJEJournal.write():166] catch an exception when writing to database. sleep and retry. journal id 1526718
com.sleepycat.je.rep.InsufficientAcksException: (JE 7.3.7) Transaction: -16160910  VLSN: 31,775,195, initiated at: 09:09:22.  Insufficient acks for policy:SIMPLE_MAJORITY. Need replica acks: 1. Missing replica acks: 1. Timeout: 2000ms. FeederState=192.168.7.5_9010_1625132780567(2)[MASTER]
Current feeds:
 192.168.7.7_9010_1625192915300: feederVLSN=31,775,198 replicaTxnEndVLSN=31,775,193
 192.168.7.4_9010_1625132697001: feederVLSN=31,775,198 replicaTxnEndVLSN=31,775,191

        at com.sleepycat.je.rep.impl.node.DurabilityQuorum.ensureSufficientAcks(DurabilityQuorum.java:205) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.stream.FeederTxns.awaitReplicaAcks(FeederTxns.java:189) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHookInternal(RepImpl.java:1426) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHook(RepImpl.java:1385) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.txn.MasterTxn.postLogCommitHook(MasterTxn.java:226) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.txn.Txn.commit(Txn.java:772) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.txn.Txn.commit(Txn.java:625) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.txn.Txn.operationEnd(Txn.java:1803) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.Database.put(Database.java:1506) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.Database.put(Database.java:1556) ~[je-7.3.7.jar:7.3.7]
        at org.apache.doris.journal.bdbje.BDBJEJournal.write(BDBJEJournal.java:159) [palo-fe.jar:3.4.0]
        at org.apache.doris.persist.EditLog.logEdit(EditLog.java:849) [palo-fe.jar:3.4.0]
        at org.apache.doris.persist.EditLog.logHeartbeat(EditLog.java:1265) [palo-fe.jar:3.4.0]
        at org.apache.doris.system.HeartbeatMgr.runAfterCatalogReady(HeartbeatMgr.java:154) [palo-fe.jar:3.4.0]
        at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) [palo-fe.jar:3.4.0]
        at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [palo-fe.jar:3.4.0]
2021-08-27 09:09:27,884 WARN (Thread-49|192) [BDBJEMetricHandler.write():117] write metric data into bdb error, key:192.168.7.7:8030_query_err_rate_1630026555000
com.sleepycat.je.rep.InsufficientAcksException: (JE 7.3.7) Transaction: -16160912  VLSN: 31,775,198, initiated at: 09:09:23.  Insufficient acks for policy:SIMPLE_MAJORITY. Need replica acks: 1. Missing replica acks: 1. Timeout: 2000ms. FeederState=192.168.7.5_9010_1625132780567(2)[MASTER]
Current feeds:
 192.168.7.7_9010_1625192915300: feederVLSN=31,775,199 replicaTxnEndVLSN=31,775,196
 192.168.7.4_9010_1625132697001: feederVLSN=31,775,199 replicaTxnEndVLSN=31,775,191

        at com.sleepycat.je.rep.impl.node.DurabilityQuorum.ensureSufficientAcks(DurabilityQuorum.java:205) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.stream.FeederTxns.awaitReplicaAcks(FeederTxns.java:189) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHookInternal(RepImpl.java:1426) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHook(RepImpl.java:1385) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.txn.MasterTxn.postLogCommitHook(MasterTxn.java:226) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.txn.Txn.commit(Txn.java:772) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.txn.Txn.commit(Txn.java:625) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.txn.Txn.operationEnd(Txn.java:1803) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.Database.put(Database.java:1506) ~[je-7.3.7.jar:7.3.7]
        at org.apache.doris.metric.collector.BDBJEMetricHandler.write(BDBJEMetricHandler.java:115) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.metric.collector.BDBJEMetricHandler.writeDouble(BDBJEMetricHandler.java:109) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.metric.collector.MetricCollector.parseFeMetricJsonAndWriteMetric(MetricCollector.java:217) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.metric.collector.MetricCollector.writeMetric(MetricCollector.java:105) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.metric.collector.MetricCollector.lambda$init$0(MetricCollector.java:77) ~[palo-fe.jar:3.4.0]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
2021-08-27 09:09:33,338 WARN (Thread-49|192) [BDBJEMetricHandler.write():117] write metric data into bdb error, key:192.168.7.7:8030_quantile0.75_1630026555000
com.sleepycat.je.rep.InsufficientAcksException: (JE 7.3.7) Transaction: -16160913  VLSN: 31,775,200, initiated at: 09:09:27.  Insufficient acks for policy:SIMPLE_MAJORITY. Need replica acks: 1. Missing replica acks: 1. Timeout: 2000ms. FeederState=192.168.7.5_9010_1625132780567(2)[MASTER]
Current feeds:
 192.168.7.7_9010_1625192915300: feederVLSN=31,775,202 replicaTxnEndVLSN=31,775,198
 192.168.7.4_9010_1625132697001: feederVLSN=31,775,202 replicaTxnEndVLSN=31,775,196

        at com.sleepycat.je.rep.impl.node.DurabilityQuorum.ensureSufficientAcks(DurabilityQuorum.java:205) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.stream.FeederTxns.awaitReplicaAcks(FeederTxns.java:189) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHookInternal(RepImpl.java:1426) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHook(RepImpl.java:1385) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.txn.MasterTxn.postLogCommitHook(MasterTxn.java:226) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.txn.Txn.commit(Txn.java:772) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.txn.Txn.commit(Txn.java:625) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.txn.Txn.operationEnd(Txn.java:1803) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.Database.put(Database.java:1506) ~[je-7.3.7.jar:7.3.7]
        at org.apache.doris.metric.collector.BDBJEMetricHandler.write(BDBJEMetricHandler.java:115) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.metric.collector.BDBJEMetricHandler.writeDouble(BDBJEMetricHandler.java:109) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.metric.collector.MetricCollector.parseFeMetricJsonAndWriteMetric(MetricCollector.java:247) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.metric.collector.MetricCollector.writeMetric(MetricCollector.java:105) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.metric.collector.MetricCollector.lambda$init$0(MetricCollector.java:77) ~[palo-fe.jar:3.4.0]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
2021-08-27 09:09:37,283 ERROR (heartbeat mgr|19) [BDBJEJournal.write():166] catch an exception when writing to database. sleep and retry. journal id 1526718
com.sleepycat.je.rep.InsufficientAcksException: (JE 7.3.7) Transaction: -16160914  VLSN: 31,775,202, initiated at: 09:09:30.  Insufficient acks for policy:SIMPLE_MAJORITY. Need replica acks: 1. Missing replica acks: 1. Timeout: 2000ms. FeederState=192.168.7.5_9010_1625132780567(2)[MASTER]
Current feeds:
 192.168.7.7_9010_1625192915300: feederVLSN=31,775,205 replicaTxnEndVLSN=31,775,200
 192.168.7.4_9010_1625132697001: feederVLSN=31,775,205 replicaTxnEndVLSN=31,775,196

        at com.sleepycat.je.rep.impl.node.DurabilityQuorum.ensureSufficientAcks(DurabilityQuorum.java:205) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.stream.FeederTxns.awaitReplicaAcks(FeederTxns.java:189) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHookInternal(RepImpl.java:1426) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.impl.RepImpl.postLogCommitHook(RepImpl.java:1385) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.rep.txn.MasterTxn.postLogCommitHook(MasterTxn.java:226) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.txn.Txn.commit(Txn.java:772) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.txn.Txn.commit(Txn.java:625) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.txn.Txn.operationEnd(Txn.java:1803) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.Database.put(Database.java:1506) ~[je-7.3.7.jar:7.3.7]
        at com.sleepycat.je.Database.put(Database.java:1556) ~[je-7.3.7.jar:7.3.7]
        at org.apache.doris.journal.bdbje.BDBJEJournal.write(BDBJEJournal.java:159) [palo-fe.jar:3.4.0]
        at org.apache.doris.persist.EditLog.logEdit(EditLog.java:849) [palo-fe.jar:3.4.0]
        at org.apache.doris.persist.EditLog.logHeartbeat(EditLog.java:1265) [palo-fe.jar:3.4.0]
        at org.apache.doris.system.HeartbeatMgr.runAfterCatalogReady(HeartbeatMgr.java:154) [palo-fe.jar:3.4.0]
        at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) [palo-fe.jar:3.4.0]
        at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [palo-fe.jar:3.4.0]
2021-08-27 09:09:40,305 WARN (Thread-49|192) [BDBJEMetricHandler.write():117] write metric data into bdb error, key:192.168.7.7:8030_quantile0.95_1630026555000
com.sleepycat.je.rep.InsufficientAcksException: (JE 7.3.7) Transaction: -16160916  VLSN: 31,775,205, initiated at: 09:09:33.  Insufficient acks for policy:SIMPLE_MAJORITY. Need replica acks: 1. Missing replica acks: 1. Timeout: 2000ms. FeederState=192.168.7.5_9010_1625132780567(2)[MASTER]
 

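As shown below:

[Image 19]

My first guess was that the heartbeat timeout was set too short, since no parameters had been tuned when testing this version.
Later I suspected that the FE metadata write failed while syncing to the replicas, and that the retries failed as well.

It took 3 restarts before the FE came back up:

[Image 20]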
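When an FE drops out like this, the state of the FE group can be inspected from any FE that is still reachable. This is a diagnostic sketch, not a fix, and whether any of the listed settings relates to the 2000ms timeout in the log above is something I have not verified.

```sql
-- Role, IsMaster, Alive, and ReplayedJournalId for every FE.
-- A follower whose ReplayedJournalId lags far behind the master is falling behind on metadata.
SHOW FRONTENDS;

-- BDBJE-related FE settings, including the timeout values.
ADMIN SHOW FRONTEND CONFIG LIKE '%bdbje%';
```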

 
