Reading and Writing GBK Files with Spark

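  1. Reading a GBK file with Spark
sc.textFile decodes every line as UTF-8, which garbles GBK input, so read the raw Text records with hadoopFile and decode the bytes explicitly:
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.TextInputFormat
sc.hadoopFile(path, classOf[TextInputFormat], classOf[LongWritable], classOf[Text], 1)
      .map(p => new String(p._2.getBytes, 0, p._2.getLength, "GBK"))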
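The offset and length arguments in the read snippet matter: Hadoop's Text.getBytes returns the backing array, and only the first getLength bytes are valid, because record readers reuse the same object for successive lines. The following is a minimal sketch of that pitfall in plain Scala, with no Spark or Hadoop dependency. The object name GbkDecodeSketch, the helper decodeValid, and the sample strings are made up for illustration, and the reused buffer is only a stand-in for Text's internal array. It assumes the JVM ships the GBK charset, which standard JDK builds do.

```scala
import java.nio.charset.Charset

object GbkDecodeSketch {
  // Assumption: the running JVM provides the GBK charset.
  val gbk: Charset = Charset.forName("GBK")

  // Decode only the valid prefix of the buffer, mirroring
  // new String(text.getBytes, 0, text.getLength, "GBK").
  def decodeValid(buffer: Array[Byte], length: Int): String =
    new String(buffer, 0, length, gbk)

  def main(args: Array[String]): Unit = {
    val longLine  = "第一行数据很长".getBytes(gbk) // 7 characters, 14 bytes
    val shortLine = "短行".getBytes(gbk)           // 2 characters, 4 bytes

    // Hypothetical reused buffer: the short record overwrites the start
    // of the long one, and the tail of the long record stays behind.
    val buffer = longLine.clone()
    System.arraycopy(shortLine, 0, buffer, 0, shortLine.length)

    println(decodeValid(buffer, shortLine.length)) // 短行
    println(new String(buffer, gbk))               // 短行行数据很长 (stale tail)
  }
}
```

Decoding the whole array would append leftovers from an earlier, longer record, which is why the read snippet passes 0 and getLength.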
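  2. Writing a GBK file with Spark
import org.apache.hadoop.io.{NullWritable, Text}
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat
import org.apache.spark.rdd.RDD

// Encode each line as GBK bytes; the NullWritable key means only the value is written.
val result: RDD[(NullWritable, Text)] = totalData.map {
        item =>
          val line = s"${item.query}"
          (NullWritable.get(), new Text(line.getBytes("GBK")))
      }
      // Set the output format; the Text bytes are written as-is, so the file is stored as GBK.
      result.saveAsNewAPIHadoopFile(path, classOf[NullWritable],
        classOf[Text], classOf[TextOutputFormat[NullWritable, Text]])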
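Files written this way contain GBK bytes, so any reader has to decode them as GBK as well. A small round-trip sketch in plain Scala (no Spark required) shows what the encoding step produces and what happens when the bytes are misread as UTF-8. The object GbkRoundTripSketch and its helper names are hypothetical and not part of the original job, and the sketch assumes the JVM provides the GBK charset.

```scala
import java.nio.charset.{Charset, StandardCharsets}

object GbkRoundTripSketch {
  // Assumption: the running JVM provides the GBK charset.
  private val gbk: Charset = Charset.forName("GBK")

  // Same encoding step as line.getBytes("GBK") in the write snippet.
  def encodeGbk(line: String): Array[Byte] = line.getBytes(gbk)

  // What a reader must do to recover the original line.
  def decodeGbk(bytes: Array[Byte]): String = new String(bytes, gbk)

  def main(args: Array[String]): Unit = {
    val line  = "中文"
    val bytes = encodeGbk(line)

    println(bytes.length)                                      // 4 (UTF-8 would need 6)
    println(decodeGbk(bytes) == line)                          // true
    println(new String(bytes, StandardCharsets.UTF_8) == line) // false: misread as UTF-8
  }
}
```

The same check works as a quick sanity test on a few sampled output lines before handing the files to a downstream consumer.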

