Spark Notes (3): One line of code to sum values when keys match



Problem description

While working on the business logic for product recommendation, I ran into a problem when computing product preference weights: the real-time weights have to be combined with the offline weights. For products whose key appears in both sources, the two weights are combined; keys that appear in only one source keep their real-time or offline weight unchanged. Since the business code runs as a real-time computation, it had to be handled in Scala (I don't know Java = =|). The data looks like this:

Time point 1:
bean_json:online_rating       value=C01:B02:12.00#C01:B01:8.00
bean_json:real_time_rating    value=C01:B01:40.00:100#C01:B02:60.00:100
Time point 2:
bean_json:online_rating       value=C01:B02:9.00#C01:B04:8.00#C01:B01:8.00#C01:B03:6.00
bean_json:real_time_rating    value=C01:B03:30.00:100#C01:B04:40.00:100#C01:B02:30.00:100
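To make the record layout concrete: each `#`-separated entry uses the first two fields joined back together as the key (e.g. C01:B01), the third field as the weight, and the real-time records carry an extra trailing count. A minimal parsing sketch; parseRecord is my own illustrative helper, not from the production code:

// Illustrative helper: turns one record like "C01:B01:40.00:100"
// into the ("C01:B01", 40.0) pair that the merge step works with.
def parseRecord(record: String): (String, Double) = {
  val fields = record.split(":")
  (fields(0) + ":" + fields(1), fields(2).toDouble)   // any trailing count field is ignored
}

val realTime = "C01:B01:40.00:100#C01:B02:60.00:100"
val parsed   = realTime.split("#").map(parseRecord)
// parsed: Array((C01:B01,40.0), (C01:B02,60.0))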

Business calculation logic

At time point 1,
the real-time weights of the products are:
C01:B01 = 40.00
C01:B02 = 60.00
offline weight = real-time weight * decay factor 0.2:
C01:B02 = 60*0.2 = 12.00
C01:B01 = 40*0.2 = 8.00
At time point 2,
the new real-time weights are:
C01:B03 = 30.00
C01:B04 = 40.00
C01:B02 = 30.00
and the new offline weights are:
C01:B02 = (12.00 + 30*0.2)/2 = 9.00
C01:B04 = 40*0.2 = 8.00
C01:B01 = 8.00
C01:B03 = 30*0.2 = 6.00
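Put another way, the new offline weight for a key is the average of the old offline weight and the decayed real-time weight when the key appears in both sources, the decayed real-time weight when it is real-time only, and the old offline weight when it is offline only. A quick sketch of that rule, reproducing the time point 2 numbers (variable names here are mine):

val decay = 0.2
val offline  = Map("C01:B02" -> 12.00, "C01:B01" -> 8.00)                      // offline weights from time point 1
val realTime = Map("C01:B03" -> 30.00, "C01:B04" -> 40.00, "C01:B02" -> 30.00) // real-time weights at time point 2

val decayed = realTime.map { case (k, v) => k -> v * decay }
val merged = (offline.keySet ++ decayed.keySet).map { k =>
  val value = (offline.get(k), decayed.get(k)) match {
    case (Some(o), Some(r)) => (o + r) / 2   // key in both: average old offline and decayed real-time
    case (Some(o), None)    => o             // offline only: keep unchanged
    case (None,    Some(r)) => r             // real-time only: the decayed weight becomes the offline weight
    case (None,    None)    => 0.0           // cannot happen: k comes from the union of the key sets
  }
  k -> value
}.toMap
// merged("C01:B02") == 9.0, merged("C01:B04") == 8.0, merged("C01:B01") == 8.0, merged("C01:B03") == 6.0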

After fumbling around for quite a while, I finally found the approach in this article: "scala 两个map合并,key相同时value相加" (merging two Scala maps and summing the values when keys match).

The main idea is to turn both weight datasets into a Map[key, value] and, while merging the two Maps, use getOrElse to handle each entry conditionally: if a key exists in both Maps, compute the combined value; if not, keep the original value unchanged.

One line of code is enough to get the result:

val unionMap = map1 ++ map2.map(t => t._1 -> (t._2 + map1.getOrElse(t._1, t._2))/2) // for matching keys, sum the two weights and divide by 2
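To see what this line does, here is a quick check with the time point 2 numbers from above (map1 holds the already-decayed real-time weights, map2 the previous offline weights):

val map1 = Map("C01:B03" -> 6.0, "C01:B04" -> 8.0, "C01:B02" -> 6.0)  // decayed real-time weights
val map2 = Map("C01:B02" -> 12.0, "C01:B01" -> 8.0)                   // previous offline weights

val unionMap = map1 ++ map2.map(t => t._1 -> (t._2 + map1.getOrElse(t._1, t._2)) / 2)
// "C01:B02" is in both maps:  (12.0 + 6.0) / 2 = 9.0
// "C01:B01" is only in map2:  (8.0 + 8.0)  / 2 = 8.0  (getOrElse falls back to t._2, so the value is unchanged)
// "C01:B03", "C01:B04" are only in map1 and are kept as-is
// unionMap: Map(C01:B03 -> 6.0, C01:B04 -> 8.0, C01:B02 -> 9.0, C01:B01 -> 8.0)

Because the entries derived from map2 come second in the ++, they overwrite map1's entries for shared keys, which is how the averaged value ends up in the result.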

Full code

/**
  * Compute the new offline weights from the current real-time weights
  * and the previous offline weights.
  *
  * @param realTimeWeight real-time weights, e.g. "C01:B03:30.00:100#C01:B04:40.00:100"
  * @param onlineWeight   offline weights, e.g. "C01:B02:12.00#C01:B01:8.00"
  * @return new offline weights in the same "#"-separated format
  */
def dealNewOnRating2(realTimeWeight: String, onlineWeight: String): String = {
  val newWeightArray = new StringBuilder

  // Parse the real-time records and apply the 0.2 decay factor
  val realTimeWeights: Array[(String, Double)] = realTimeWeight.split("#").map { record =>
    val fields = record.split(":")
    val cateInfo = fields(0) + ":" + fields(1)
    val rtWeight = fields(2).toDouble * 0.2
    (cateInfo, rtWeight)
  }

  // Parse the offline records, keeping their weights as-is
  val onlineWeights: Array[(String, Double)] = onlineWeight.split("#").map { record =>
    val fields = record.split(":")
    val cateInfo = fields(0) + ":" + fields(1)
    val onWeight = fields(2).toDouble
    (cateInfo, onWeight)
  }

  val map1 = realTimeWeights.toMap
  val map2 = onlineWeights.toMap
  // For matching keys, sum the two weights and divide by 2; unmatched keys keep their value
  val unionMap = map1 ++ map2.map(t => t._1 -> (t._2 + map1.getOrElse(t._1, t._2)) / 2)
  // Sort by weight descending and keep the top 10
  val unionWeight = unionMap.toArray.sortBy(_._2)(Ordering[Double].reverse).take(10)

  for ((cateInfo, weight) <- unionWeight) {
    newWeightArray ++= cateInfo ++= ":" ++= weight.formatted("%.2f") ++= "#"
  }
  newWeightArray.toString
}
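As a quick sanity check (my own test call, not part of the original post), feeding the time point 2 real-time rating and the time point 1 offline rating into the function should reproduce the time point 2 offline rating, up to the ordering of equal weights and the trailing "#" the builder appends:

val realTime = "C01:B03:30.00:100#C01:B04:40.00:100#C01:B02:30.00:100"
val online   = "C01:B02:12.00#C01:B01:8.00"

println(dealNewOnRating2(realTime, online))
// C01:B02:9.00#C01:B04:8.00#C01:B01:8.00#C01:B03:6.00#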
