mongo-scala: update if a field exists, insert if it does not

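Problem:
Spark Streaming processes real-time data and writes the aggregated results to MongoDB. With the mongo-java API this takes an extra check: look up a given dimension, update its metrics if it exists, and insert the dimension and metric fields if it does not. The extra lookup per write is slow and inefficient.
Switching to the mongo-scala API (Casbah) and using its upsert mode handles both the insert and the update in one call. The fields used in the query must be indexed in MongoDB.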

/**
   * Performs an update operation.
   * @param q search query for old object to update
   * @param o object with which to update `q`
   */
  def update[A, B](q: A, o: B, upsert: Boolean = false, multi: Boolean = false,
                   concern:                  com.mongodb.WriteConcern = this.writeConcern,
                   bypassDocumentValidation: Option[Boolean]          = None)(implicit queryView: A => DBObject, objView: B => DBObject,
                                                                              encoder: DBEncoder = customEncoderFactory.map(_.create).orNull): WriteResult = {
    bypassDocumentValidation match {
      case None                   => underlying.update(queryView(q), objView(o), upsert, multi, concern, encoder)
      case Some(bypassValidation) => underlying.update(queryView(q), objView(o), upsert, multi, concern, bypassValidation, encoder)
    }
  }
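As a minimal sketch of how the `update` method above might be called with `upsert = true`, the example below increments a metric for one dimension and creates the document if it is missing. The host, the database and collection names (`stats`, `page_metrics`), and the field names (`dim`, `pv`) are assumptions made for illustration; they do not come from the original job.

```scala
import com.mongodb.casbah.Imports._

object UpsertSketch {
  // Hypothetical field names: "dim" is the dimension key, "pv" is the metric.
  def buildUpsert(dim: String, delta: Long): (DBObject, DBObject) = {
    val query  = MongoDBObject("dim" -> dim) // matches the existing document, if any
    val update = $inc("pv" -> delta)         // adds delta; the field is created on insert
    (query, update)
  }

  def main(args: Array[String]): Unit = {
    // Assumed connection details; replace with your own host, database, and collection.
    val client = MongoClient("localhost", 27017)
    val coll   = client("stats")("page_metrics")

    // Index the queried field so the upsert lookup does not scan the whole collection.
    coll.createIndex(MongoDBObject("dim" -> 1))

    val (query, update) = buildUpsert("page_a", 3L)
    // upsert = true: update the matching document, or insert one built from query + update.
    coll.update(query, update, upsert = true)

    client.close()
  }
}
```

With `$inc`, the first write for a dimension inserts a document whose `pv` equals the increment, and later writes add to it, so no separate find step is needed.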

Add the dependencies:
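
    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>casbah_2.11</artifactId>
        <version>3.1.1</version>
        <type>pom</type>
    </dependency>

    <dependency>
        <groupId>org.mongodb.spark</groupId>
        <artifactId>mongo-spark-connector_2.11</artifactId>
        <version>${spark.version}</version>
    </dependency>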

Note: remove the java-mongo-driver dependency from the pom, otherwise the two drivers conflict.
