Spark development: the "Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure" problem

While cleaning log data with Spark recently (written in Scala), I ran into this exception:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure:

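Tracing back through the exception log, I found that the cause was an out-of-bounds array index. The log data is not well-formed: some lines are too short and are missing fields. For example, when extracting the URL: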

 val url = splits(10)

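Because indices start at 0, splits(10) reads the 11th field, so a single log line that splits into fewer than 11 fields is enough to trigger this error.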
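
One way to avoid the crash is to check the field count before indexing and skip lines that are too short. The snippet below is only a minimal sketch in plain Scala, not the code from my job: the object name LogFields, the helper extractUrl, and the tab delimiter are all assumptions made for illustration, so adjust them to the real log format.

```scala
// Hypothetical helper: the names, the delimiter, and the field index are assumptions.
object LogFields {
  // Position of the URL field, matching splits(10) in the original code.
  val UrlIndex = 10

  // Returns Some(url) when the line has at least 11 fields, None otherwise.
  def extractUrl(line: String, delimiter: String = "\t"): Option[String] = {
    // A limit of -1 keeps trailing empty fields, so the field count is not understated.
    val splits = line.split(delimiter, -1)
    if (splits.length > UrlIndex) Some(splits(UrlIndex)) else None
  }
}
```

In a Spark job, assuming the raw lines are in an RDD[String] called lines, something like lines.flatMap(line => LogFields.extractUrl(line)) should keep only the URLs from well-formed lines and silently drop the rest. If the malformed lines matter, count or log them instead of discarding them.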

Recording it here for future reference.
