A solution for complex Aggregation operations with Java and MongoDB


Background

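I need to query MongoDB for objects that have timed out, or should no longer be alive, and that need to be processed.
So far I can think of the following approaches: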

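  1. Use the most basic approach (DSL, @Query, Spring Data and so on) to load the data, then compare timestamps, check flag types and so on in application code. Doing that logic on the application server seems rather wasteful.
  2. Write the aggregation natively with the classes in the mongo driver package.
  3. Use the Aggregation.newAggregation() style provided by Spring Data.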

How the solution evolved

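From the start, approach 1 clearly did not fit our case: we hit mongo on a timer every 1 to 2 seconds and process the data, so we cannot afford to spend much time on the application side. Rejected outright.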
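To make it concrete what approach 1 would mean, here is a minimal sketch of the same condition evaluated in application code after every document has been loaded. The `Task` class is a hypothetical, trimmed-down stand-in for the real RefreshTask document, and I am assuming `childTaskAlivePeriod` is in seconds (inferred from the multiplication by 1000 in the query).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Date;
import java.util.List;

class AppSideFilter {
    // Hypothetical stand-in for the RefreshTask document (only the fields the condition needs).
    static class Task {
        final boolean expireFlag;
        final int collectState;
        final Date nextCollectTime;
        final Date createdTime;
        final long childTaskAlivePeriod; // assumed to be in seconds

        Task(boolean expireFlag, int collectState, Date nextCollectTime,
             Date createdTime, long childTaskAlivePeriod) {
            this.expireFlag = expireFlag;
            this.collectState = collectState;
            this.nextCollectTime = nextCollectTime;
            this.createdTime = createdTime;
            this.childTaskAlivePeriod = childTaskAlivePeriod;
        }
    }

    // expireFlag is set, OR the task is collecting and its next collect time has passed,
    // OR createdTime + childTaskAlivePeriod * 1000 ms is already in the past.
    static boolean needsProcessing(Task t, Date now) {
        Date expireTime = new Date(t.createdTime.getTime() + t.childTaskAlivePeriod * 1000);
        return t.expireFlag
                || (t.collectState == 1 && t.nextCollectTime.before(now))
                || expireTime.before(now);
    }

    // Every document has to be transferred and inspected, which is the cost approach 1 pays.
    static List<Task> filter(List<Task> all, Date now) {
        List<Task> result = new ArrayList<>();
        for (Task t : all) {
            if (needsProcessing(t, now)) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Date now = new Date(10_000L);
        Task alive = new Task(false, 0, new Date(20_000L), new Date(9_000L), 5);
        Task expired = new Task(false, 0, new Date(20_000L), new Date(1_000L), 5);
        System.out.println(filter(Arrays.asList(alive, expired), now).size());
    }
}
```

The logic itself is trivial; the problem is that the whole collection crosses the network on every tick just so most of it can be thrown away.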

Approach 2. Without further ado, here is the query:

db.RefreshTask.aggregate([
    { $project: { _id:1, uri:1, scope:1, desc:1, pushState:1,
        collectState:1, collectInterval:1, lastCollectTime:1, refreshRequestHeaders:1,
        effectivePercent:1, hasDataVersion:1, childTaskAlivePeriod:1, parentId:1, createdTime:1,
        lastModifiedTime:1, lastCollectValue:1, lastPushValue:1, expireFlag:1, nextCollectTime:1,
        expireTime:{$add: [ "$createdTime", { $multiply: [ "$childTaskAlivePeriod", 1000 ] } ]} }},
    { $match: { $or:[{expireFlag:{$eq:true}},
        {$and:[{collectState:{$eq:1}}, {nextCollectTime:{$lt:ISODate()}}]},
        {expireTime:{$lt:ISODate()}}]}}
])

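If you are new to this, you are probably lost already. What on earth is this, with layer upon layer of nesting? Ask a more experienced colleague and they will probably tell you to go read the official docs: https://docs.mongodb.com/manual/reference/operator/aggregation/ Fair enough!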

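After studying it for a while, though, I found the code is actually fairly simple. With approach 2, the official mongo driver package, writing the code is just a matter of nesting objects layer by layer, like this: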

        Date now = new Date();
        DBCollection dbCollection = mongodb.getCollection("RefreshTask");
        // $project
        BasicDBList addList = new BasicDBList();
        BasicDBList multiplyList = new BasicDBList();
        multiplyList.add("$childTaskAlivePeriod");
        multiplyList.add(1000);
        BasicDBObject $multiply = new BasicDBObject("$multiply", multiplyList);
        addList.add("$createdTime");
        addList.add($multiply);
        BasicDBObject $add = new BasicDBObject("$add", addList);

        DBObject $project = new BasicDBObject("$project",
                new BasicDBObject("_id", 1)
                .append("uri", 1)
                .append("scope", 1)
                .append("desc", 1)
                .append("pushState", 1)
                .append("collectState", 1)
                .append("collectInterval", 1)
                .append("lastCollectTime", 1)
                .append("refreshRequestHeaders", 1)
                .append("effectivePercent", 1)
                .append("hasDataVersion", 1)
                .append("childTaskAlivePeriod", 1)
                .append("parentId", 1)
                .append("createdTime", 1)
                .append("lastModifiedTime", 1)
                .append("lastCollectValue", 1)
                .append("lastPushValue", 1)
                .append("expireFlag", 1)
                .append("nextCollectTime", 1)
                .append("expireTime", $add)
        );

        // $match
        BasicDBList orList = new BasicDBList();
        BasicDBObject expireFlag = new BasicDBObject("expireFlag", new BasicDBObject("$eq", true));
        BasicDBList match$and = new BasicDBList();
        match$and.add(new BasicDBObject("collectState", new BasicDBObject("$eq", 1)));
        match$and.add(new BasicDBObject("nextCollectTime", new BasicDBObject("$lt", now)));
        BasicDBObject $and = new BasicDBObject("$and", match$and);

        BasicDBObject expireTime = new BasicDBObject("expireTime", new BasicDBObject("$lt", now));
        orList.add(expireFlag);
        orList.add($and);
        orList.add(expireTime);

        BasicDBObject $or = new BasicDBObject("$or", orList);

        BasicDBObject $match = new BasicDBObject("$match", $or);

        List<DBObject> dbStages = Arrays.asList($project, $match);

        AggregationOptions build = AggregationOptions.builder().outputMode(AggregationOptions.OutputMode.CURSOR).build();

        // Fetch the data
        Cursor cursor = dbCollection.aggregate(dbStages, build);

        // Assemble the results
        List<RefreshTaskDO> refreshTaskDOS = new ArrayList<>();
        while (cursor.hasNext()) {
            RefreshTaskDO refreshTaskDO = new RefreshTaskDO();
            DBObject next = cursor.next();
            try {
                dbObject2RefreshTaskDO(next, refreshTaskDO);
                // Only keep beans that were converted successfully
                refreshTaskDOS.add(refreshTaskDO);
            } catch (Exception e) {
                LogUtils.error(e, log, "transfer to bean error - scope:%s, uri:%s", next.get("scope"), next.get("uri"));
            }
        }

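That is really not bad, just a few dozen lines of code. By the way, when you translate a complex mongo statement like this into code, you can print the final stage list (dbStages above) as a string and inspect the generated mongo statement, which makes checking it very quick. While I am at it, here is a pitfall I hit: the MongoDB server our company uses is version 3.6, while the driver we pull in is 3.4.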
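However, the Builder in com.mongodb.AggregationOptions defaults to OutputMode.INLINE, and from the various sources I checked, MongoDB above 3.5 seems to reject that with the error below. Changing the OutputMode to OutputMode.CURSOR is all it takes.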

com.mongodb.MongoCommandException:
Command failed with error 9: 'The 'cursor' option is required, except for aggregate with the explain argument' on server. The full response is { "ok" : 0.0, "errmsg" : "The 'cursor' option is required, except for aggregate with the explain argument", "code" : 9, "codeName" : "FailedToParse" }
at com.mongodb.connection.ProtocolHelper.getCommandFailureException(ProtocolHelper.java:115)
at com.mongodb.connection.CommandProtocol.execute(CommandProtocol.java:114)
at com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:168)
at com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:289)
at com.mongodb.connection.DefaultServerConnection.command(DefaultServerConnection.java:176)
at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:216)
at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:207)
at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:113)
at com.mongodb.operation.AggregateOperation$1.call(AggregateOperation.java:257)
at com.mongodb.operation.AggregateOperation$1.call(AggregateOperation.java:253)
at com.mongodb.operation.OperationHelper.withConnectionSource(OperationHelper.java:435)
at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:408)
at com.mongodb.operation.AggregateOperation.execute(AggregateOperation.java:253)
at com.mongodb.operation.AggregateOperation.execute(AggregateOperation.java:67)
at com.mongodb.Mongo.execute(Mongo.java:836)
at com.mongodb.Mongo$2.execute(Mongo.java:823)
at com.mongodb.DBCollection.aggregate(DBCollection.java:1455)
at com.mongodb.DBCollection.aggregate(DBCollection.java:1418)
at com.mongodb.DBCollection.aggregate(DBCollection.java:1403)
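As for the earlier tip about printing the stage list: with the driver, printing dbStages itself should be enough. The sketch below only illustrates the idea without the driver on the classpath. It builds a shortened version of the pipeline from plain LinkedHashMap and List objects (the driver's BasicDBObject and BasicDBList are themselves a map and a list underneath) and renders it with a hypothetical `render` helper, so the nesting can be compared against the shell query by eye.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PipelinePrinter {
    // Hypothetical helper: a one-key document, playing the role of new BasicDBObject(key, value).
    static Map<String, Object> doc(String key, Object value) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put(key, value);
        return m;
    }

    // Renders nested maps and lists in a shell-like notation for eyeballing.
    static String render(Object node) {
        if (node instanceof Map) {
            StringBuilder sb = new StringBuilder("{");
            String sep = "";
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                sb.append(sep).append(e.getKey()).append(": ").append(render(e.getValue()));
                sep = ", ";
            }
            return sb.append("}").toString();
        }
        if (node instanceof List) {
            StringBuilder sb = new StringBuilder("[");
            String sep = "";
            for (Object item : (List<?>) node) {
                sb.append(sep).append(render(item));
                sep = ", ";
            }
            return sb.append("]").toString();
        }
        if (node instanceof String) {
            return "\"" + node + "\"";
        }
        return String.valueOf(node);
    }

    // Shortened pipeline: only the computed expireTime field and one match condition.
    static List<Object> stages() {
        Object multiply = doc("$multiply", Arrays.asList("$childTaskAlivePeriod", 1000));
        Object add = doc("$add", Arrays.asList("$createdTime", multiply));
        Object project = doc("$project", doc("expireTime", add));
        Object match = doc("$match", doc("expireFlag", true));
        return Arrays.asList(project, match);
    }

    public static void main(String[] args) {
        System.out.println(render(stages()));
    }
}
```

If the printed structure matches the statement you tested in the shell, the translation into code is most likely right.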

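Back to the main topic: what comes back is a Cursor object, which is a subtype of Iterator, so you read the objects the Iterator way. Here is the catch, though: cursor.next() always returns a DBObject, so I went looking for a way to turn a DBObject into a bean. Every result I found was some variant of this: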

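/**
   * Converts a DBObject into a bean.
   * @param dbObject the source document
   * @param bean the bean instance to populate
   * @return the populated bean, or null if bean is null
   * @throws IllegalAccessException if a property setter is not accessible
   * @throws InvocationTargetException if a property setter throws
   * @throws NoSuchMethodException if a required accessor method cannot be found
   */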
  public static <T> T dbObject2Bean(DBObject dbObject, T bean) throws IllegalAccessException,  
      InvocationTargetException, NoSuchMethodException {  
    if (bean == null) {  
      return null;  
    }  
    Field[] fields = bean.getClass().getDeclaredFields();  
    for (Field field : fields) {  
      String varName = field.getName();  
      Object object = dbObject.get(varName);  
      if (object != null) {  
        BeanUtils.setProperty(bean, varName, object);  
      }  
    }  
    return bean;  
  }  

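This uses org.apache.commons.beanutils.BeanUtils.
Simple as it is, I refuse to pull in a dependency just for this, even though I cannot really say why. So I switched to another way: manually get each value from the dbObject and set each field of refreshTaskDO (which really is a noob way to do it). But when a field refers to another document, the value returned by dbObject.get(key) cannot be set on the bean directly, so I kept looking, and at long last found a way to map the results straight into a bean.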
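For what it is worth, the manual route can be done with plain reflection and no extra dependency. The following is only a sketch under assumptions, not what I ended up using: a plain Map stands in for the DBObject, the `Task` and `Owner` beans are hypothetical, and a nested document is assumed to arrive as a nested Map that should be mapped onto a nested bean.

```java
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

class ReflectiveMapper {
    // Hypothetical beans; Owner plays the role of a field that holds another document.
    static class Owner {
        String name;
    }

    static class Task {
        String uri;
        int collectState;
        Owner owner;
    }

    // Copies every key that matches a declared field; nested maps are mapped recursively.
    static <T> T toBean(Map<String, Object> doc, Class<T> type) {
        try {
            T bean = type.getDeclaredConstructor().newInstance();
            for (Field field : type.getDeclaredFields()) {
                if (field.isSynthetic()) {
                    continue; // skip compiler-generated fields
                }
                Object value = doc.get(field.getName());
                if (value == null) {
                    continue; // key absent: leave the field at its default
                }
                if (value instanceof Map) {
                    @SuppressWarnings("unchecked")
                    Map<String, Object> nested = (Map<String, Object>) value;
                    value = toBean(nested, field.getType());
                }
                field.setAccessible(true);
                field.set(bean, value);
            }
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot map document to " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> owner = new LinkedHashMap<>();
        owner.put("name", "ops");
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("uri", "/a");
        doc.put("collectState", 1);
        doc.put("owner", owner);

        Task task = toBean(doc, Task.class);
        System.out.println(task.uri + " " + task.collectState + " " + task.owner.name);
    }
}
```

This ignores type conversion, collections and inherited fields, which is exactly the kind of work a real mapper takes off your hands.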

The final approach

First, feel the charm of Spring Data:

// Printing the aggregation shows the generated mongo command:
{
	"aggregate": "RefreshTask",
	"pipeline": [{
		"$project": {
			"_id": 1,
			"uri": 1,
			"scope": 1,
			"desc": 1,
			"pushState": 1,
			"collectState": 1,
			"collectInterval": 1,
			"lastCollectTime": 1,
			"refreshRequestHeaders": 1,
			"effectivePercent": 1,
			"hasDataVersion": 1,
			"childTaskAlivePeriod": 1,
			"parentId": 1,
			"createdTime": 1,
			"lastModifiedTime": 1,
			"lastCollectValue": 1,
			"lastPushValue": 1,
			"expireFlag": 1,
			"nextCollectTime": 1,
			"expireTime": {
				"$add": ["$createdTime",
				{
					"$multiply": ["$childTaskAlivePeriod", 1000]
				}]
			}
		}
	},
	{
		"$match": {
			"$or": [{
				"expireFlag": true
			},
			{
				"expireTime": {
					"$lt": {
						"$date": "2018-12-27T08:51:59.845Z"
					}
				}
			},
			{
				"$and": [{
					"collectState": 1
				},
				{
					"nextCollectTime": {
						"$lt": {
							"$date": "2018-12-27T08:51:59.845Z"
						}
					}
				}]
			}]
		}
	}],
	"cursor": {
		"batchSize": 2147483647
	}
}

// The Spring Data code that produces the command above (mongodb here is a MongoTemplate)
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;
import org.springframework.data.mongodb.core.query.Criteria;

        Date now = new Date();
        Aggregation aggregation = Aggregation.newAggregation(
                Aggregation.project("_id","uri","scope","desc","pushState","collectState","collectInterval","lastCollectTime","refreshRequestHeaders","effectivePercent","hasDataVersion","childTaskAlivePeriod","parentId","createdTime","lastModifiedTime","lastCollectValue","lastPushValue","expireFlag","nextCollectTime")
                        .andExpression("createdTime + childTaskAlivePeriod * 1000").as("expireTime"),
                Aggregation.match(new Criteria().orOperator(
                        Criteria.where("expireFlag").is(true),
                        new Criteria().andOperator(Criteria.where("collectState").is(1), Criteria.where("nextCollectTime").lt(now)),
                        Criteria.where("expireTime").lt(now))));
        AggregationResults<RefreshTaskDO> refreshTask = mongodb.aggregate(aggregation, "RefreshTask", RefreshTaskDO.class);
        List<RefreshTaskDO> mappedResults = refreshTask.getMappedResults();

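Now I can only sigh: having used the simple query-by-method-name style of Spring Data JPA before, I completely forgot about this magical thing when a truly complex statement came along. The code shrinks to just a few lines.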

Time to have fun again!!!
