MongoDB Advanced Syntax

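Some basic MongoDB concepts:

  1. The aggregate method is essentially a pipeline made up of stages. The stages run one after another in the order they are defined, and the output of each stage is the input of the next.
  2. Comparing one field with another field requires special operators such as $redact or $expr.
  3. Formatting a timestamp as a date string requires $dateToString, and you need to watch out for the time zone.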
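
The first two points can be illustrated without a database. The sketch below is plain JavaScript, not MongoDB code: runPipeline, matchStage, and projectStage are hypothetical helpers and the two sample documents are made up. It only imitates how each stage receives the previous stage's output and how a field is compared with another field.

```javascript
// Plain-JavaScript illustration of the pipeline idea; none of these helpers are MongoDB APIs.
// Each "stage" is a function that takes an array of documents and returns a new array.
const matchStage = (predicate) => (docs) => docs.filter(predicate);
const projectStage = (mapper) => (docs) => docs.map(mapper);

// Run the stages in order, feeding each stage the previous stage's output.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Made-up sample documents, used only for this illustration.
const sampleDocs = [
  { subject: "A", custom_time: 2000, subscribe_time: 1000 },
  { subject: "B", custom_time: 1000, subscribe_time: 3000 },
];

const result = runPipeline(sampleDocs, [
  // Field-to-field comparison: the job that $expr or $redact does inside MongoDB.
  matchStage((d) => d.custom_time > d.subscribe_time),
  // Compute a derived field from the documents that survived the previous stage.
  projectStage((d) => ({ subject: d.subject, subTime: d.custom_time - d.subscribe_time })),
]);

console.log(result);
```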

Prerequisites:

  1. MongoDB version: 4.2.1
  2. Collection: t_wechat_user, with the following document structure:
{
  "_id": "ObjectId(\"5f5f3ef2b53b633689108dfs\")",
  "open_id": "o69xlwMzwkhTIlYoFGEWeHzUtles",
  "app_id": "wxa82301b25sdf0153",
  "subscribe_time": "NumberLong(1578880338000)",
  "custom_time": "NumberLong(1638328107017)",
  "subject": "游戏原画",
  "create_time": "ISODate(\"2020-09-14T09:59:14.955Z\")",
  "update_time": "ISODate(\"2021-12-01T05:37:35.962Z\")"
}

custom_time and subscribe_time are timestamps in milliseconds.
Normally subscribe_time is earlier than custom_time,
but subscribe_time can also be later than custom_time.

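Requirement 1:

Group by subject and, for documents whose custom_time falls in a given range, compute the average interval between subscribe_time and custom_time. Only documents where custom_time is later than subscribe_time by less than 7 days are counted. In SQL this would look roughly like: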

SELECT subject, avg( custom_time - subscribe_time ) from t_wechat_user
WHERE
    custom_time BETWEEN 1635696000000 AND 1638288000000 
    AND custom_time > subscribe_time 
    AND ( custom_time - subscribe_time ) < 604800000 
GROUP BY subject

The equivalent MongoDB query is:

db.t_wechat_user.aggregate([{
    $match: {
        custom_time: {
            "$gte": 1635696000000,
            "$lt": 1638288000000
        }
    }
}, {
    $redact: {
        $cond: {
            if : {
               $and:[{"$gt":["$custom_time","$subscribe_time"]}, {"$lt": [{"$subtract": ["$custom_time", 604800000]}, "$subscribe_time"]}]
            },
            then: "$$KEEP",
            else : "$$PRUNE"
        }
    }
}, {
    $project: 
    {
        custom_time: 1,
        subject: 1,
        "subTime": {
            "$subtract": ["$custom_time", "$subscribe_time"]
        }
    }
}, {
    $group: {
        _id: "$subject",
        myCount: { $sum: 1 },
        subTimeAvg: {
            $avg: "$subTime"
        }
    }
}]);
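
The $redact condition is easier to check when written out as ordinary code. The following is a minimal sketch in plain JavaScript, assuming documents shaped like the sample above; keepDocument and keepDocumentByDifference are hypothetical names. It shows where 604800000 comes from and that "custom_time minus 7 days is before subscribe_time" is the same rule as "the interval is less than 7 days".

```javascript
// 7 days expressed in milliseconds: 7 * 24 h * 60 min * 60 s * 1000 ms = 604800000.
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Mirrors the $redact condition: $$KEEP when this returns true, $$PRUNE otherwise.
function keepDocument(doc) {
  const afterSubscribe = doc.custom_time > doc.subscribe_time;
  const withinSevenDays = doc.custom_time - SEVEN_DAYS_MS < doc.subscribe_time;
  return afterSubscribe && withinSevenDays;
}

// The same rule written on the interval itself, as in the SQL WHERE clause.
function keepDocumentByDifference(doc) {
  const interval = doc.custom_time - doc.subscribe_time;
  return interval > 0 && interval < SEVEN_DAYS_MS;
}

// The sample document from this article: the interval is far longer than 7 days.
const sampleDoc = { subscribe_time: 1578880338000, custom_time: 1638328107017 };
// A made-up document where custom_time is one hour after subscribe_time.
const recentDoc = { subscribe_time: 1638324507017, custom_time: 1638328107017 };

console.log(SEVEN_DAYS_MS, keepDocument(sampleDoc), keepDocument(recentDoc));
```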

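Requirement 2:

Group by day and, for documents whose custom_time falls in a given range, compute the average interval between subscribe_time and custom_time. Again, only documents where custom_time is later than subscribe_time by less than 7 days are counted. In SQL this would look roughly like: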

SELECT DATE_FORMAT(FROM_UNIXTIME(custom_time / 1000),"%Y-%m-%d") AS day, avg( custom_time - subscribe_time ) from t_wechat_user
WHERE
    custom_time BETWEEN 1635696000000 AND 1638288000000 
    AND custom_time > subscribe_time 
    AND ( custom_time - subscribe_time ) < 604800000 
GROUP BY DATE_FORMAT(FROM_UNIXTIME(custom_time / 1000),"%Y-%m-%d")

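This requires converting a millisecond timestamp into a formatted date string, which is what $dateToString does. $dateToString expects a date, so the timestamp is first added to the epoch, ISODate("1970-01-01T00:00:00Z"), to produce one. Watch out for the time zone: without the timezone option the result is formatted in UTC, so "+08:00" is passed here to group by local day.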

db.t_wechat_user.aggregate([{
    $match: {
        custom_time: {
            "$gte": 1635696000000,
            "$lt": 1638288000000
        }
    }
}, {
    $redact: {
        $cond: {
            if : {
               $and:[{"$gt":["$custom_time","$subscribe_time"]}, {"$lt": [{"$subtract": ["$custom_time", 604800000]}, "$subscribe_time"]}]
            },
            then: "$$KEEP",
            else : "$$PRUNE"
        }
    }
}, {
    $project: 
    {
        custom_time: 1,
        subject: 1,
        "subTime": {
            "$subtract": ["$custom_time", "$subscribe_time"]
        }
    }
}, {
    $group: {
        _id: { $dateToString: { format: "%Y-%m-%d", date:{$add:[ISODate("1970-01-01T00:00:00Z"),"$custom_time"]},timezone: "+08:00" }},
        myCount: { $sum: 1 },
        subTimeAvg: {
            $avg: "$subTime"
        }
    }
}]);
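
To see why the timezone option matters, the sketch below reproduces the day calculation in plain JavaScript. formatDay is a hypothetical helper, not a MongoDB function; it takes the UTC offset in minutes. The timestamp used is the upper bound of the query range, 1638288000000, which is midnight on 2021-12-01 at +08:00 but still 2021-11-30 in UTC.

```javascript
// Format a millisecond timestamp as YYYY-MM-DD at a fixed UTC offset (in minutes).
// Shifting the timestamp by the offset and reading the UTC date gives the local date.
function formatDay(timestampMs, offsetMinutes) {
  const shifted = new Date(timestampMs + offsetMinutes * 60 * 1000);
  return shifted.toISOString().slice(0, 10);
}

const boundary = 1638288000000; // upper bound of the custom_time range used above

const utcDay = formatDay(boundary, 0);        // what grouping in UTC would see
const beijingDay = formatDay(boundary, 480);  // what grouping with "+08:00" sees

console.log(utcDay, beijingDay);
```

A document with this custom_time would therefore land in a different daily bucket depending on whether timezone: "+08:00" is supplied.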

Requirement 3:

What if I only want to list the matching documents instead of grouping and aggregating them?

SELECT open_id, subject, app_id, subscribe_time, custom_time  FROM t_wechat_user 
WHERE
    custom_time BETWEEN 1635696000000 AND 1638288000000 
    AND custom_time > subscribe_time 
    AND ( custom_time - subscribe_time ) < 604800000

The equivalent MongoDB query is:

db.t_wechat_user.find({
    custom_time: {
        $gte: 1635696000000,
        $lt: 1638288000000
    },
    $expr: {
        $and: [{
            "$gt": ["$custom_time", "$subscribe_time"]
        }, {
            "$lt": [{
                "$subtract": ["$custom_time", 604800000]
            }, "$subscribe_time"]
        }]
    }
}, {
    open_id: 1,
    subject: 1,
    app_id: 1,
    subscribe_time: 1,
    custom_time: 1
});

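As you may have noticed by now, the first two requirements can also be written another way: the $redact stage can be folded into the $match stage by using $expr. For requirement 1 this looks like: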

db.t_wechat_user.aggregate([{
    $match: {
        custom_time: {
            "$gte": 1635696000000,
            "$lt": 1638288000000
        },
        $expr: {
            $and: [{
                "$gt": ["$custom_time", "$subscribe_time"]
            }, {
                "$lt": [{
                    "$subtract": ["$custom_time", "$subscribe_time"]
                }, 604800000]
            }]
        }
    }
}, {
    $project: {
        custom_time: 1,
        subject: 1,
        "subTime": {
            "$subtract": ["$custom_time", "$subscribe_time"]
        }
    }
}, {
    $group: {
        _id: "$subject",
        myCount: {
            $sum: 1
        },
        subTimeAvg: {
            $avg: "$subTime"
        }
    }
}]);
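
Since requirements 1 and 2 differ only in the $group key, the pipeline can be generated from parameters. The following is one possible sketch, not the only way to structure it: buildPipeline and its option names (startMs, endMs, maxGapMs, groupBy) are hypothetical, and new Date(0) stands in for ISODate("1970-01-01T00:00:00Z") so the function also runs outside the mongo shell.

```javascript
// Build the $match + $project + $group pipeline used in this article.
// groupBy is either "subject" (requirement 1) or "day" (requirement 2).
function buildPipeline({ startMs, endMs, maxGapMs, groupBy }) {
  // Interval between the two timestamp fields, in milliseconds.
  const gap = { $subtract: ["$custom_time", "$subscribe_time"] };

  // Group key: the subject field, or the local calendar day of custom_time.
  const groupKey = groupBy === "day"
    ? {
        $dateToString: {
          format: "%Y-%m-%d",
          date: { $add: [new Date(0), "$custom_time"] }, // epoch + ms = date
          timezone: "+08:00",
        },
      }
    : "$subject";

  return [
    {
      $match: {
        custom_time: { $gte: startMs, $lt: endMs },
        // Field-to-field conditions go through $expr.
        $expr: {
          $and: [
            { $gt: ["$custom_time", "$subscribe_time"] },
            { $lt: [gap, maxGapMs] },
          ],
        },
      },
    },
    { $project: { custom_time: 1, subject: 1, subTime: gap } },
    { $group: { _id: groupKey, myCount: { $sum: 1 }, subTimeAvg: { $avg: "$subTime" } } },
  ];
}

const options = { startMs: 1635696000000, endMs: 1638288000000, maxGapMs: 604800000 };
const bySubject = buildPipeline({ ...options, groupBy: "subject" });
const byDay = buildPipeline({ ...options, groupBy: "day" });

console.log(JSON.stringify(bySubject, null, 2));
```

In the shell, the result would be passed straight to the collection, for example db.t_wechat_user.aggregate(byDay).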
