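Background
- Business scenario: Spark batch-writes to Elasticsearch through the es-hadoop connector
- The batch job runs on a fixed schedule
- CDH 5.5.3 cluster, Spark 2.3, Elasticsearch 6.4.3
- The _id of each document in the target index is assigned by the application and is guaranteed to be globally unique
- The problem occurs only in the test environment, and only occasionally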
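
For orientation, here is a minimal sketch of what such a write setup might look like. The option keys are standard es-hadoop settings, but every value is an assumption: the host, the type name in `es.resource`, and the use of the `id` field for `es.mapping.id` are illustrative guesses, not taken from the actual job. The helper `bulk_upsert_lines` is hypothetical and only mimics the shape of the bulk action that an upsert write produces (an `update` action followed by a `doc_as_upsert` body).

```python
import json

# Option keys are real es-hadoop settings; the values are placeholders.
ES_WRITE_OPTIONS = {
    "es.nodes": "es-host.example",                   # assumed host name
    "es.port": "9200",
    "es.resource": "offline_quota_library_s/_doc",   # type name is assumed
    "es.mapping.id": "id",                           # field used as _id (assumed)
    "es.write.operation": "upsert",                  # update if present, insert otherwise
    "es.batch.size.entries": "1000",                 # entries per bulk request
}


def bulk_upsert_lines(doc, id_field="id"):
    """Render one document as the two NDJSON lines of a bulk upsert action.

    Hypothetical helper for illustration: it reproduces the payload shape,
    not es-hadoop's actual serializer.
    """
    action = {"update": {"_id": doc[id_field]}}
    body = {"doc_as_upsert": True, "doc": doc}
    compact = {"separators": (",", ":")}
    return json.dumps(action, **compact) + "\n" + json.dumps(body, **compact) + "\n"


if __name__ == "__main__":
    sample = {"id": "OZVIK_2462056_2019-05-18", "product_no": "OZVIK"}
    print(bulk_upsert_lines(sample), end="")
```

In a PySpark job, options like these would typically be passed with something along the lines of `df.write.format("org.elasticsearch.spark.sql").options(**ES_WRITE_OPTIONS).save()`, assuming the es-hadoop jar is on the classpath.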
Problem description
The full error message is as follows:
19/05/20 11:08:54 ERROR yarn.ApplicationMaster: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 24.0 failed 4 times, most recent failure: Lost task 2.3 in stage 24.0 (TID 849, p016d052n01, executor 6): org.elasticsearch.hadoop.EsHadoopException: Could not write all entries for bulk operation [24/1000]. Error sample (first [5] error messages):
org.elasticsearch.hadoop.rest.EsHadoopRemoteException: version_conflict_engine_exception: [offline_quota_library_s][OZVIK_2462056_2019-05-18]: version conflict, document already exists (current version [1])
{
"update":{
"_id":"OZVIK_2462056_2019-05-18"}}
{
"doc_as_upsert":true,"doc":{
"id":"OZVIK_2462056_2019-05-18","product_no":"OZVIK","cust_id":"2462056","p106":32,"p107":61,"p108":55,"p109":"YGM6E","p110":1,"p111":46,"p112":11126,"p113":189,"p114":70,"p115":6,"p116":60,"p117":"male","p118":"gg","p119":19,"p120":2,"p121":1544025600000,"p122":69,"p123":"FL0SS","dt":"2019-05-18","absum01":71,"testday01":76,"testday02":11202,"testday03":"7611202","testday04":"70male","testday04_2":22404,"testday05":"761120270male761120222404","amount01":"YGM6E2462056","amount02":22252,"amount03":"OZVIK","aa":11197,"testb21":93,"fix_const_999_0222":999,"0304tf":"999 2462056 YGM6E","0305test_long":11173,"hello":87,"datetest":"2019-05-18","binarytest":32,"nestedtest":"YGM6E","aaaaaaaaaaaaaaaaabaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa":"OZVIK","floattest02":1,"__namelist_54":"0"}}
org.elasticsearch.hadoop.rest.EsHadoopRemoteException: version_conflict_engine_exception: [offline_quota_library_s][OZWTC_148752_2019-05-18]: version conflict, document already exists (current version [1])
{
"update":{
"_id":"OZWTC_148752_2019-05-18"}}
{
"doc_as_upsert":true,"doc":{
"id":"OZWTC_148752_2019-05-18","product_no":"OZWTC","cust_id":"148752",