Advanced Aggregation Queries in MongoDB (Single-Purpose Aggregation, Aggregation Pipeline)

I. MongoDB Aggregation

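Aggregation operations process data records and return computed results. They combine values from multiple documents and can perform a variety of operations on the grouped data to return a single result. MongoDB provides three ways to perform aggregation: the aggregation pipeline, the map-reduce function, and single-purpose aggregation methods.
1. Single-purpose aggregation operations
MongoDB provides db.collection.estimatedDocumentCount(), db.collection.count(), and db.collection.distinct().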

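All of these operations aggregate documents from a single collection. They give simple access to common aggregation tasks, but they lack the flexibility and capabilities of the aggregation pipeline and map-reduce.
(1) db.collection.estimatedDocumentCount()
Returns the count of all documents in a collection or view. The count is based on collection metadata rather than on a scan of the documents, hence "estimated".
Example: retrieve the count of all documents in the orders collection

> db.orders.find()			// the orders collection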
{ "_id" : ObjectId("5d3e560d55ad906481cb2ecc"), "cust_id" : "abc123", "status" : "A", "price" : 50, "items" : [ { "sku" : "xxx", "qty" : 25, "price" : 1 }, { "sku" : "yyy", "qty" : 25, "price" : 1 } ] }
{ "_id" : ObjectId("5d3e570e55ad906481cb2ecd"), "cust_id" : "def456", "status" : "A", "price" : 50, "items" : [ { "sku" : "zzz", "qty" : 25, "price" : 1 }, { "sku" : "www", "qty" : 25, "price" : 1 } ] }
{ "_id" : ObjectId("5d3e573955ad906481cb2ece"), "cust_id" : "ghi789", "status" : "B", "price" : 100, "items" : [ { "sku" : "xxx", "qty" : 25, "price" : 1 }, { "sku" : "yyy", "qty" : 25, "price" : 1 } ] }
{ "_id" : ObjectId("5d3e587855ad906481cb2ecf"), "cust_id" : "abc123", "status" : "A", "price" : 70, "items" : [ { "sku" : "xxx", "qty" : 25, "price" : 1 }, { "sku" : "yyy", "qty" : 25, "price" : 1 } ] }

> db.orders.estimatedDocumentCount({})
4

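(2) db.collection.count()
Returns the number of documents that would match a find() query on the collection or view. db.collection.count() does not perform the find() operation; it counts and returns the number of results that match the query.
Note that on a sharded cluster, db.collection.count() without a query predicate can return an inaccurate count if orphaned documents exist or a chunk migration is in progress. To avoid this, use db.collection.aggregate() on sharded clusters.
Example: count all documents in the orders collection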

> db.orders.count()
4

count() is equivalent to the db.collection.find(query).count() construct. The operation above is equivalent to:

> db.orders.find().count()
4

Example: count all documents that match a query

> db.orders.count({price : {$gt : 50}})
2
> db.orders.find({price : {$gt : 50}}).count()			// the two queries are equivalent
2

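(3) db.collection.distinct()
Finds the distinct values of a specified field across a single collection or view and returns the results in an array.

> db.orders.distinct("cust_id")		// returns an array of the distinct cust_id values
[ "abc123", "def456", "ghi789" ]

> db.orders.distinct("items.sku")		// returns an array of the distinct values of sku embedded in the items field
[ "xxx", "yyy", "www", "zzz" ]

> db.orders.distinct("price",{status : "A"})	// returns an array of the distinct price values among documents whose status is "A"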
[ 50, 70 ]
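
To make the semantics of distinct() concrete, here is a minimal in-memory sketch in plain JavaScript. It is not MongoDB code: the helpers valuesAt and distinct are hypothetical names, the data is a trimmed copy of the orders collection, and the sketch only handles primitive values.

```javascript
// In-memory stand-in for the orders collection (only the fields used here).
const orders = [
  { cust_id: "abc123", status: "A", price: 50, items: [{ sku: "xxx" }, { sku: "yyy" }] },
  { cust_id: "def456", status: "A", price: 50, items: [{ sku: "zzz" }, { sku: "www" }] },
  { cust_id: "ghi789", status: "B", price: 100, items: [{ sku: "xxx" }, { sku: "yyy" }] },
  { cust_id: "abc123", status: "A", price: 70, items: [{ sku: "xxx" }, { sku: "yyy" }] },
];

// Resolve a dotted path. Arrays met along the way are flattened,
// so "items.sku" yields one value per array element.
function valuesAt(doc, path) {
  let current = [doc];
  for (const key of path.split(".")) {
    current = current
      .flatMap((v) => (Array.isArray(v) ? v : [v]))
      .filter((v) => v !== null && typeof v === "object" && key in v)
      .map((v) => v[key]);
  }
  return current.flatMap((v) => (Array.isArray(v) ? v : [v]));
}

// Hypothetical helper: distinct values of `path` among docs passing `predicate`.
// A Set deduplicates primitives only; embedded documents would need deep comparison.
function distinct(docs, path, predicate = () => true) {
  const seen = new Set();
  for (const doc of docs.filter(predicate)) {
    for (const value of valuesAt(doc, path)) seen.add(value);
  }
  return [...seen];
}

const custIds = distinct(orders, "cust_id");
const skus = distinct(orders, "items.sku");
const pricesA = distinct(orders, "price", (d) => d.status === "A");
console.log(custIds, skus, pricesA);
```

The order of the returned values here simply follows first appearance; it is not meant to reproduce the order MongoDB returns.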

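2. Aggregation Pipeline
db.collection.aggregate() implements an aggregation pipeline for data processing. Documents pass through a pipeline made up of multiple stages; each stage transforms them (for example by filtering or grouping), and the output of the last stage is the result.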

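The figure that originally appeared here (not reproduced) illustrated the process with a two-stage pipeline:
Stage 1: the $match stage filters documents by the status field and passes those whose status equals A to the next stage;

Stage 2: the $group stage groups the documents by the cust_id field and computes the sum of the amounts for each unique cust_id.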
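
The two stages can be mimicked in plain JavaScript to show how documents flow from one stage to the next. This is only a sketch: the stage functions and runPipeline are hypothetical names, the data is a trimmed copy of the orders collection, and the price field is assumed to play the role of the amount being summed.

```javascript
// Trimmed copy of the orders collection used earlier.
const orders = [
  { cust_id: "abc123", status: "A", price: 50 },
  { cust_id: "def456", status: "A", price: 50 },
  { cust_id: "ghi789", status: "B", price: 100 },
  { cust_id: "abc123", status: "A", price: 70 },
];

// Each stage is a function from an array of documents to an array of documents.
// Stage 1: keep only documents whose status is "A" (the $match step).
const matchStatusA = (docs) => docs.filter((d) => d.status === "A");

// Stage 2: group by cust_id and sum price per group (the $group step).
const groupTotalByCustomer = (docs) => {
  const totals = new Map();
  for (const d of docs) {
    totals.set(d.cust_id, (totals.get(d.cust_id) || 0) + d.price);
  }
  return [...totals].map(([_id, total]) => ({ _id, total }));
};

// Run the stages in order, feeding each stage's output into the next.
const runPipeline = (docs, stages) => stages.reduce((acc, stage) => stage(acc), docs);

const result = runPipeline(orders, [matchStatusA, groupTotalByCustomer]);
console.log(result);
```

Because the filter runs first, the grouping stage only ever sees three of the four documents, which is the main reason to put $match early in a real pipeline.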

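Commonly used aggregate pipeline stages:
(1) $count
Passes to the next stage a document containing a count of the documents input to the stage. Unlike db.collection.count(), which counts the documents matching a find()-style query, $count counts whatever documents the preceding stages pass to it.

> db.orders.find()		// sample collection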
{ "_id" : ObjectId("5d3e560d55ad906481cb2ecc"), "cust_id" : "abc123", "status" : "A", "price" : 50, "items" : [ { "sku" : "xxx", "qty" : 25, "price" : 1 }, { "sku" : "yyy", "qty" : 25, "price" : 1 } ] }
{ "_id" : ObjectId("5d3e570e55ad906481cb2ecd"), "cust_id" : "def456", "status" : "A", "price" : 50, "items" : [ { "sku" : "zzz", "qty" : 25, "price" : 1 }, { "sku" : "www", "qty" : 25, "price" : 1 } ] }
{ "_id" : ObjectId("5d3e573955ad906481cb2ece"), "cust_id" : "ghi789", "status" : "B", "price" : 100, "items" : [ { "sku" : "xxx", "qty" : 25, "price" : 1 }, { "sku" : "yyy", "qty" : 25, "price" : 1 } ] }
{ "_id" : ObjectId("5d3e587855ad906481cb2ecf"), "cust_id" : "abc123", "status" : "A", "price" : 70, "items" : [ { "sku" : "xxx", "qty" : 25, "price" : 1 }, { "sku" : "yyy", "qty" : 25, "price" : 1 } ] }

> db.orders.aggregate([
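... {$match : {price : {$gt:50}}},	// $match excludes documents whose price is less than or equal to 50 and passes the rest to the next stage
... {$count : "High_price"}	// $count returns the number of documents remaining in the pipeline and assigns it to a field named High_price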
... ])
{ "High_price" : 2 }

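A $count stage is equivalent to the following $group + $project sequence (run here without the $match stage, so it counts all four documents):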

> db.orders.aggregate([
... {$group : {_id : null,MyCount : {$sum : 1}}},
... {$project : {_id : 0}}
... ])
{ "MyCount" : 4 }

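(2) $group

Groups documents by a specified expression and outputs one document per distinct group to the next stage. Each output document contains an _id field holding the group key, and it can also contain computed fields that hold the values of accumulator expressions evaluated over the group. $group does not output the individual input documents, only the aggregated results.

  • The _id field is required; specify an _id of null to compute accumulated values over all input documents as a whole.
  • The remaining computed fields are optional and are computed with accumulator operators.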

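Accumulator operators:

Name Description
$avg Computes the average.
$first Returns the value from the first document in each group. If the documents were sorted beforehand, the sort order applies; otherwise the default stored order is used.
$last Returns the value from the last document in each group. If the documents were sorted beforehand, the sort order applies; otherwise the default stored order is used.
$max Returns the highest value of the expression across the documents in each group.
$min Returns the lowest value of the expression across the documents in each group.
$push Appends the value of the specified expression to an array.
$addToSet Adds the value of the expression to a set (no duplicates, unordered).
$sum Computes the sum.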

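Example 1: the $group stage groups the documents by month, day, and year, and computes the total price, the average quantity, and the number of documents in each group:

> db.test.find()			// sample collection test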
{ "_id" : 1, "item" : "abc", "price" : 10, "quantity" : 2, "date" : ISODate("2014-03-01T08:00:00Z") }
{ "_id" : 2, "item" : "jkl", "price" : 20, "quantity" : 1, "date" : ISODate("2014-03-01T09:00:00Z") }
{ "_id" : 3, "item" : "xyz", "price" : 5, "quantity" : 10, "date" : ISODate("2014-03-15T09:00:00Z") }
{ "_id" : 4, "item" : "xyz", "price" : 5, "quantity" : 20, "date" : ISODate("2014-04-04T11:21:39.736Z") }
{ "_id" : 5, "item" : "abc", "price" : 10, "quantity" : 10, "date" : ISODate("2014-04-04T21:23:13.331Z") }

> db.test.aggregate(
...  [
...       {
...         $group : {
...            _id : { month: { $month: "$date" }, day: { $dayOfMonth: "$date" }, year: { $year: "$date" } },
...            totalPrice: { $sum: { $multiply: [ "$price", "$quantity" ] } },
...            averageQuantity: { $avg: "$quantity" },
...            count: { $sum: 1 }
...         }
...       }
...    ]
... )
{ "_id" : { "month" : 4, "day" : 4, "year" : 2014 }, "totalPrice" : 200, "averageQuantity" : 15, "count" : 2 }
{ "_id" : { "month" : 3, "day" : 15, "year" : 2014 }, "totalPrice" : 50, "averageQuantity" : 10, "count" : 1 }
{ "_id" : { "month" : 3, "day" : 1, "year" : 2014 }, "totalPrice" : 40, "averageQuantity" : 1.5, "count" : 2 }

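Example 2: group by null. The following aggregation specifies a group _id of null and computes the total price, average quantity, and count over all documents in the collection: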

> db.test.aggregate(
... [
...       {
...         $group : {
...            _id : null,
...            totalPrice: { $sum: { $multiply: [ "$price", "$quantity" ] } },
...            averageQuantity: { $avg: "$quantity" },
...            count: { $sum: 1 }
...         }
...       }
...    ]
... )
{ "_id" : null, "totalPrice" : 290, "averageQuantity" : 8.6, "count" : 5 }
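
The arithmetic behind Examples 1 and 2 can be checked with a small in-memory sketch in plain JavaScript. It is not how MongoDB implements $group: groupSales is a hypothetical helper that hard-codes the three accumulators used above, and the date parts are read in UTC.

```javascript
// In-memory copy of the test collection.
const sales = [
  { _id: 1, item: "abc", price: 10, quantity: 2, date: new Date("2014-03-01T08:00:00Z") },
  { _id: 2, item: "jkl", price: 20, quantity: 1, date: new Date("2014-03-01T09:00:00Z") },
  { _id: 3, item: "xyz", price: 5, quantity: 10, date: new Date("2014-03-15T09:00:00Z") },
  { _id: 4, item: "xyz", price: 5, quantity: 20, date: new Date("2014-04-04T11:21:39.736Z") },
  { _id: 5, item: "abc", price: 10, quantity: 10, date: new Date("2014-04-04T21:23:13.331Z") },
];

// Hypothetical helper: groups by keyOf(doc) and accumulates
// totalPrice ($sum of price * quantity), averageQuantity ($avg) and count ($sum: 1).
function groupSales(docs, keyOf) {
  const groups = new Map();
  for (const d of docs) {
    const id = keyOf(d);
    const key = JSON.stringify(id); // serialise so object keys compare by value
    const g = groups.get(key) || { _id: id, totalPrice: 0, quantitySum: 0, count: 0 };
    g.totalPrice += d.price * d.quantity;
    g.quantitySum += d.quantity;
    g.count += 1;
    groups.set(key, g);
  }
  // Replace the running quantity sum with the average.
  return [...groups.values()].map(({ quantitySum, ...g }) => ({
    ...g,
    averageQuantity: quantitySum / g.count,
  }));
}

// Example 1: group by month, day and year.
const byDay = groupSales(sales, (d) => ({
  month: d.date.getUTCMonth() + 1,
  day: d.date.getUTCDate(),
  year: d.date.getUTCFullYear(),
}));

// Example 2: a constant key of null puts every document in one group.
const overall = groupSales(sales, () => null);
console.log(byDay, overall);
```

The sketch returns groups in order of first appearance, whereas the order of $group output in MongoDB is not specified.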

Example 3: retrieve distinct values

> db.test.aggregate([{$group : {_id:"$item"}}])
{ "_id" : "xyz" }
{ "_id" : "jkl" }
{ "_id" : "abc" }

Example 4: pivot data. Transform the data in the books collection into titles grouped by author

> db.books.find()
{ "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }
{ "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }
{ "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
{ "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }
{ "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }

> db.books.aggregate(
...    [
...      { $group : { _id : "$author", books: { $push: "$title" } } }
...    ]
... )
{ "_id" : "Homer", "books" : [ "The Odyssey", "Iliad" ] }
{ "_id" : "Dante", "books" : [ "The Banquet", "Divine Comedy", "Eclogues" ] }

Example 5: use the $$ROOT system variable to group whole documents by author. Each resulting document must not exceed the BSON document size limit.

> db.books.aggregate(
...    [
...      { $group : { _id : "$author", books: { $push: "$$ROOT" } } }
...    ]
... )
{ "_id" : "Homer", "books" : [ { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }, { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 } ] }
{ "_id" : "Dante", "books" : [ { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }, { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }, { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 } ] }

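(3) $match
Filters the documents and passes only those that match the specified conditions to the next pipeline stage.
$match takes a document that specifies the query conditions. It does not accept raw aggregation expressions; since MongoDB 3.6 they can be included by wrapping them in $expr.

Pipeline optimization: $match filters documents so that later stages aggregate over the resulting subset. It accepts the regular query operators, with a few exceptions such as $where and $near/$nearSphere. In practice, place $match as early in the pipeline as possible. This has two benefits: unneeded documents are filtered out quickly, which reduces the work for the rest of the pipeline; and when $match runs before projection and grouping, the query can use indexes.

Example 1: a simple equality match with $match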

> db.articles.find()
{ "_id" : ObjectId("5d3ac60cb3a8911f2c1a70db"), "author" : "Dave", "score" : 80, "views" : 100 }
{ "_id" : ObjectId("5d3ac617b3a8911f2c1a70dc"), "author" : "Dave", "score" : 85, "views" : 521 }
{ "_id" : ObjectId("5d3ac62ab3a8911f2c1a70dd"), "author" : "Ahn", "score" : 60, "views" : 1000 }
{ "_id" : ObjectId("5d3ac63bb3a8911f2c1a70de"), "author" : "li", "score" : 55, "views" : 5000 }
{ "_id" : ObjectId("5d3ac64bb3a8911f2c1a70df"), "author" : "annT", "score" : 60, "views" : 50 }
{ "_id" : ObjectId("5d3ac65fb3a8911f2c1a70e0"), "author" : "li", "score" : 94, "views" : 999 }
{ "_id" : ObjectId("5d3ac66fb3a8911f2c1a70e1"), "author" : "Ty", "score" : 95, "views" : 1000 }

> db.articles.aggregate(
... [{$match : {author : "Dave"}}]
... )
{ "_id" : ObjectId("5d3ac60cb3a8911f2c1a70db"), "author" : "Dave", "score" : 80, "views" : 100 }
{ "_id" : ObjectId("5d3ac617b3a8911f2c1a70dc"), "author" : "Dave", "score" : 85, "views" : 521 }

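Example 2: use the $match stage to select the documents to process, then pipe the results to the $group stage to count them ($match selects documents whose score is greater than 70 and less than 90, or whose views is greater than or equal to 1000):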

> db.articles.aggregate( [
...   { $match: { $or: [ { score: { $gt: 70, $lt: 90 } }, { views: { $gte: 1000 } } ] } },
...   { $group: { _id: null, count: { $sum: 1 } } }
... ] )
{ "_id" : null, "count" : 5 }
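
To see why the count is 5, the $or condition can be written as an ordinary predicate and applied to an in-memory copy of the articles collection. This is a plain JavaScript sketch for checking the logic, not MongoDB code; the names matches and matched are illustrative.

```javascript
// In-memory copy of the articles collection (ObjectIds omitted).
const articles = [
  { author: "Dave", score: 80, views: 100 },
  { author: "Dave", score: 85, views: 521 },
  { author: "Ahn", score: 60, views: 1000 },
  { author: "li", score: 55, views: 5000 },
  { author: "annT", score: 60, views: 50 },
  { author: "li", score: 94, views: 999 },
  { author: "Ty", score: 95, views: 1000 },
];

// Same condition as the $match stage:
// (70 < score < 90) OR (views >= 1000)
const matches = (d) => (d.score > 70 && d.score < 90) || d.views >= 1000;

// $match step: keep the matching documents.
const matched = articles.filter(matches);

// $group step with _id: null and count: { $sum: 1 }.
const result = [{ _id: null, count: matched.length }];
console.log(matched.map((d) => d.author), result);
```

Two documents satisfy the score range (both by Dave) and three satisfy the views threshold (Ahn, li with 5000 views, Ty), which gives the total of 5.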

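(4) $unwind
Deconstructs an array field from the input documents and outputs one document per element. Each output document is the input document with the value of the array field replaced by the element.
Example 1: $unwind outputs one document for each element of the sizes array: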

> db.inventory.find()
{ "_id" : 1, "item" : "ABC1", "sizes" : [ "S", "M", "L" ] }

> db.inventory.aggregate([{$unwind : "$sizes"}])
{ "_id" : 1, "item" : "ABC1", "sizes" : "S" }
{ "_id" : 1, "item" : "ABC1", "sizes" : "M" }
{ "_id" : 1, "item" : "ABC1", "sizes" : "L" }

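New in version 3.2: $unwind also accepts a document with the following fields:

path Field path to an array field. To specify a field path, prefix the field name with a dollar sign $ and enclose it in quotes.
includeArrayIndex Optional. The name of a new field that holds the array index of the element. The name cannot start with a dollar sign $.
preserveNullAndEmptyArrays Optional. If true, $unwind outputs the document when path is null, missing, or an empty array. If false, $unwind does not output a document in those cases. The default is false.

Example 2: the following $unwind operations are equivalent and return one document for each element of the sizes field. Documents whose sizes field is missing, null, or an empty array produce no output, and a non-array value such as "M" is treated as a single-element array.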

> db.inventory.find()
{ "_id" : 1, "item" : "ABC", "sizes" : [ "S", "M", "L" ] }
{ "_id" : 2, "item" : "EFG", "sizes" : [ ] }
{ "_id" : 3, "item" : "IJK", "sizes" : "M" }
{ "_id" : 4, "item" : "LMN" }
{ "_id" : 5, "item" : "XYZ", "sizes" : null }

> db.inventory.aggregate( [ { $unwind: "$sizes" } ]
... )
{ "_id" : 1, "item" : "ABC", "sizes" : "S" }
{ "_id" : 1, "item" : "ABC", "sizes" : "M" }
{ "_id" : 1, "item" : "ABC", "sizes" : "L" }
{ "_id" : 3, "item" : "IJK", "sizes" : "M" }

> db.inventory.aggregate( [ { $unwind: { path: "$sizes" } } ] )		// equivalent to the previous operation
{ "_id" : 1, "item" : "ABC", "sizes" : "S" }
{ "_id" : 1, "item" : "ABC", "sizes" : "M" }
{ "_id" : 1, "item" : "ABC", "sizes" : "L" }
{ "_id" : 3, "item" : "IJK", "sizes" : "M" }

Example 3: $unwind uses the includeArrayIndex option to output the array index of each element.

> db.inventory.aggregate( [ { $unwind: { path: "$sizes", includeArrayIndex: "arrayIndex" } } ] )
{ "_id" : 1, "item" : "ABC", "sizes" : "S", "arrayIndex" : NumberLong(0) }
{ "_id" : 1, "item" : "ABC", "sizes" : "M", "arrayIndex" : NumberLong(1) }
{ "_id" : 1, "item" : "ABC", "sizes" : "L", "arrayIndex" : NumberLong(2) }
{ "_id" : 3, "item" : "IJK", "sizes" : "M", "arrayIndex" : null }

Example 4: $unwind uses the preserveNullAndEmptyArrays option to include in the output the documents whose sizes field is missing, null, or an empty array.

> db.inventory.aggregate( [
...    { $unwind: { path: "$sizes", preserveNullAndEmptyArrays: true } }
... ] )
{ "_id" : 1, "item" : "ABC", "sizes" : "S" }
{ "_id" : 1, "item" : "ABC", "sizes" : "M" }
{ "_id" : 1, "item" : "ABC", "sizes" : "L" }
{ "_id" : 2, "item" : "EFG" }
{ "_id" : 3, "item" : "IJK", "sizes" : "M" }
{ "_id" : 4, "item" : "LMN" }
{ "_id" : 5, "item" : "XYZ", "sizes" : null }
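
The behaviour shown in Examples 2 to 4 can be summarised in a short in-memory sketch. This is plain JavaScript, not MongoDB's implementation: unwind is a hypothetical helper, it works on a top-level field name rather than a "$"-prefixed path, and indexes are plain numbers instead of NumberLong.

```javascript
// In-memory copy of the inventory collection from Example 2.
const inventory = [
  { _id: 1, item: "ABC", sizes: ["S", "M", "L"] },
  { _id: 2, item: "EFG", sizes: [] },
  { _id: 3, item: "IJK", sizes: "M" },
  { _id: 4, item: "LMN" },
  { _id: 5, item: "XYZ", sizes: null },
];

// Hypothetical helper mirroring the two $unwind options discussed above.
function unwind(docs, field, { includeArrayIndex, preserveNullAndEmptyArrays = false } = {}) {
  // Attach the index field only when the option was given.
  const withIndex = (d, i) => (includeArrayIndex ? { ...d, [includeArrayIndex]: i } : d);
  const out = [];
  for (const doc of docs) {
    const value = doc[field];
    if (Array.isArray(value) && value.length > 0) {
      // One output document per element, with the array replaced by the element.
      value.forEach((el, i) => out.push(withIndex({ ...doc, [field]: el }, i)));
    } else if (Array.isArray(value) || value === null || value === undefined) {
      // Empty array, null, or missing field: dropped unless explicitly preserved.
      if (preserveNullAndEmptyArrays) {
        const { [field]: _dropped, ...rest } = doc;
        // Assumption of this sketch: a preserved document gets a null index.
        out.push(withIndex(value === null ? doc : rest, null));
      }
    } else {
      // A non-array value is treated as a single-element array; its index is null.
      out.push(withIndex(doc, null));
    }
  }
  return out;
}

const plain = unwind(inventory, "sizes");
const indexed = unwind(inventory, "sizes", { includeArrayIndex: "arrayIndex" });
const preserved = unwind(inventory, "sizes", { preserveNullAndEmptyArrays: true });
console.log(plain.length, indexed.length, preserved.length);
```

The three calls correspond to Examples 2, 3, and 4 and yield 4, 4, and 7 documents respectively.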

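(5) $project
Passes documents containing only the requested fields to the next stage in the pipeline. The specified fields can be existing fields from the input documents or newly computed fields.
A $project specification takes the following forms:

<field>: <1 or true> Includes the field.
_id: <0 or false> Suppresses the _id field. By default, the _id field is included in the output documents.
<field>: <expression> Adds a new field or resets the value of an existing field. Changed in version 3.6: MongoDB 3.6 adds the variable REMOVE; if the expression evaluates to $$REMOVE, the field is excluded from the output.
<field>: <0 or false> Excludes the field.

Example 1: include specific fields in the output documents of the $project stage: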

> db.book.find()
{ "_id" : 1, "title" : "abc123", "isbn" : "0001122223334", "author" : { "last" : "zzz", "first" : "aaa" }, "copies" : 5 }

> db.book.aggregate([
... {$project : {title : 1,author : 1}}
... ])
{ "_id" : 1, "title" : "abc123", "author" : { "last" : "zzz", "first" : "aaa" } }

> db.book.aggregate([ {$project : {_id : 0,title : 1,author : 1}} ])		// suppress the _id field in the output documents
{ "title" : "abc123", "author" : { "last" : "zzz", "first" : "aaa" } }

Example 2: exclude specific fields from the output documents

> db.book.find().pretty()
{
        "_id" : 1,
        "title" : "abc123",
        "isbn" : "0001122223334",
        "author" : {
                "last" : "zzz",
                "first" : "aaa"
        },
        "copies" : 5,
        "lastModified" : "2016-07-28"
}

> db.book.aggregate([
... {$project : {"lastModified" : 0}}
... ])			// exclude the lastModified field
{ "_id" : 1, "title" : "abc123", "isbn" : "0001122223334", "author" : { "last" : "zzz", "first" : "aaa" }, "copies" : 5 }

> db.book.aggregate([ {$project : {"author.first" : 0,"lastModified" : 0}} ])		// exclude a field from an embedded document
{ "_id" : 1, "title" : "abc123", "isbn" : "0001122223334", "author" : { "last" : "zzz" }, "copies" : 5 }

> db.book.aggregate([ {$project : {"author" : {"first" : 0},"lastModified" : 0}} ])		// same exclusion, with the specification nested in a document
{ "_id" : 1, "title" : "abc123", "isbn" : "0001122223334", "author" : { "last" : "zzz" }, "copies" : 5 }

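Starting in MongoDB 3.6, the REMOVE variable can be used in aggregation expressions to conditionally suppress a field.
Example 3: the $project stage uses the REMOVE variable to exclude the author.middle field only when it equals "":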

> db.book.find().pretty()
{
        "_id" : 1,
        "title" : "abc123",
        "isbn" : "0001122223334",
        "author" : {
                "last" : "zzz",
                "first" : "aaa"
        },
        "copies" : 5,
        "lastModified" : "2016-07-28"
}
{
        "_id" : 2,
        "title" : "Baked Goods",
        "isbn" : "9999999999999",
        "author" : {
                "last" : "xyz",
                "first" : "abc",
                "middle" : ""
        },
        "copies" : 2,
        "lastModified" : "2017-07-21"
}
{
        "_id" : 3,
        "title" : "Ice Cream Cakes",
        "isbn" : "8888888888888",
        "author" : {
                "last" : "xyz",
                "first" : "abc",
                "middle" : "mmm"
        },
        "copies" : 5,
        "lastModified" : "2017-07-22"
}

> db.book.aggregate([
... {
...       $project: {
...          title: 1,
...          "author.first": 1,
...          "author.last" : 1,
...          "author.middle": {
...             $cond: {
...                if: { $eq: [ "", "$author.middle" ] },
...                then: "$$REMOVE",
...                else: "$author.middle"
...             }
...          }
...       }
...    }
... ] )
{ "_id" : 1, "title" : "abc123", "author" : { "last" : "zzz", "first" : "aaa" } }
{ "_id" : 2, "title" : "Baked Goods", "author" : { "last" : "xyz", "first" : "abc" } }
{ "_id" : 3, "title" : "Ice Cream Cakes", "author" : { "last" : "xyz", "first" : "abc", "middle" : "mmm" } }

Projecting new array fields
Example 4: the following aggregation returns a new array field, MyArray

> db.collection.find()
{ "_id" : ObjectId("5d3ea6a4a76866b3f79f0754"), "x" : 1, "y" : 1 }

> db.collection.aggregate([{$project : {MyArray : ["$x","$y"]}}])
{ "_id" : ObjectId("5d3ea6a4a76866b3f79f0754"), "MyArray" : [ 1, 1 ] }

// if the array specification references a field that does not exist, null is substituted for it:
> db.collection.aggregate([{$project : {MyArray : ["$x","$y","$testField"]}}])
{ "_id" : ObjectId("5d3ea6a4a76866b3f79f0754"), "MyArray" : [ 1, 1, null ] }

(6) $limit
Limits the number of documents passed to the next stage in the pipeline.
Example:

> db.article.aggregate(
...     { $limit : 5 }
... )

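This operation returns only the first 5 documents passed to it by the pipeline. $limit has no effect on the content of the documents it passes along.
Note: when a $sort immediately precedes a $limit in the pipeline, the $sort operation maintains only the top n results as it progresses, where n is the specified limit, and MongoDB only needs to keep n items in memory. This optimization still applies when allowDiskUse is true and the n items exceed the aggregation memory limit.
(7) $skip
Skips the specified number of documents that enter the stage and passes the remaining documents to the next stage in the pipeline.
Example: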

> db.article.aggregate(
...     { $skip : 5 }
... )

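This operation skips the first 5 documents passed to it by the pipeline. $skip has no effect on the content of the documents it passes along.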
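
$skip and $limit are often combined, with $skip first, to read one page of results at a time. The following plain JavaScript sketch shows the idea on an in-memory array; skip, limit, and page are hypothetical helpers, and the 12 sample documents are invented purely for illustration.

```javascript
// Stage-like helpers: each returns a function from documents to documents.
const skip = (n) => (docs) => docs.slice(n);     // drop the first n documents
const limit = (n) => (docs) => docs.slice(0, n); // keep at most n documents

// Illustrative data: 12 documents with _id 1..12.
const docs = Array.from({ length: 12 }, (_, i) => ({ _id: i + 1 }));

// Page numbers start at 1. Skip the earlier pages, then take one page.
const page = (pageNumber, pageSize) =>
  limit(pageSize)(skip((pageNumber - 1) * pageSize)(docs));

const secondPage = page(2, 5); // documents 6..10
const thirdPage = page(3, 5);  // documents 11..12 (a short last page)
console.log(secondPage.map((d) => d._id), thirdPage.map((d) => d._id));
```

Neither helper changes the documents themselves, which matches the statement that $skip and $limit do not affect document content. In a real pipeline a $sort stage would normally come first so that the pages are stable.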
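(8) $sort
Sorts all input documents and returns them to the pipeline in sorted order. In the sort specification, 1 means ascending and -1 means descending.
Example: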

> db.users.aggregate(
...    [
...      { $sort : { age : -1, posts: 1 } }
...    ]
... )
{ "_id" : ObjectId("5d3eaa8aa76866b3f79f0757"), "name" : "lala", "age" : 20, "posts" : 600 }
{ "_id" : ObjectId("5d3eaa7aa76866b3f79f0756"), "name" : "hehe", "age" : 19, "posts" : 0 }
{ "_id" : ObjectId("5d3eaa6ea76866b3f79f0755"), "name" : "haha", "age" : 19, "posts" : 710 }

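When comparing values of different BSON types, MongoDB uses the following comparison order, from lowest to highest: MinKey, Null, Numbers (ints, longs, doubles, decimals), Symbol and String, Object, Array, BinData, ObjectId, Boolean, Date, Timestamp, Regular Expression, MaxKey.
For example, after adding a document whose age is a string, the same sort returns: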

> db.users.aggregate(
...    [
...      { $sort : { age : -1, posts: 1 } }
...    ]
... )
{ "_id" : ObjectId("5d3eaae7a76866b3f79f0758"), "name" : "hehe", "age" : "asdf", "posts" : 0 }		// strings sort higher than numbers
{ "_id" : ObjectId("5d3eaa8aa76866b3f79f0757"), "name" : "lala", "age" : 20, "posts" : 600 }
{ "_id" : ObjectId("5d3eaa7aa76866b3f79f0756"), "name" : "hehe", "age" : 19, "posts" : 0 }
{ "_id" : ObjectId("5d3eaa6ea76866b3f79f0755"), "name" : "haha", "age" : 19, "posts" : 710 }
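
The ordering in both runs can be reproduced with a multi-key comparator. The plain JavaScript sketch below is a deliberately simplified model: typeRank only distinguishes numbers from strings (the two types in the example) rather than the full BSON order, and sortBy is a hypothetical helper.

```javascript
// Simplified type rank covering only the two types in the example:
// numbers sort below strings.
const typeRank = (v) => (typeof v === "number" ? 0 : 1);

// Compare by type first, then by value within the same type.
function compareValues(a, b) {
  if (typeRank(a) !== typeRank(b)) return typeRank(a) - typeRank(b);
  return a < b ? -1 : a > b ? 1 : 0;
}

// spec is an array of [field, direction] pairs; 1 = ascending, -1 = descending.
// Later keys are consulted only when all earlier keys compare equal.
function sortBy(docs, spec) {
  return [...docs].sort((x, y) => {
    for (const [field, direction] of spec) {
      const c = compareValues(x[field], y[field]);
      if (c !== 0) return c * direction;
    }
    return 0;
  });
}

// In-memory copy of the users collection (ObjectIds omitted).
const users = [
  { name: "haha", age: 19, posts: 710 },
  { name: "hehe", age: 19, posts: 0 },
  { name: "lala", age: 20, posts: 600 },
  { name: "hehe", age: "asdf", posts: 0 },
];

// Same specification as { $sort : { age : -1, posts: 1 } }.
const sorted = sortBy(users, [["age", -1], ["posts", 1]]);
console.log(sorted);
```

With age descending, the string value comes first, then 20, and the two documents with age 19 are ordered by posts ascending.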

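(9) $sortByCount
New in version 3.4. Groups incoming documents by the value of a specified expression, then computes the number of documents in each distinct group. Each output document contains two fields: an _id field holding the distinct group value, and a count field holding the number of documents in that group. The output documents are sorted by count in descending order.
The $sortByCount stage is equivalent to the following $group + $sort sequence:

{ $group: { _id: <expression>, count: { $sum: 1 } } },
{ $sort: { count: -1 } }

Example: $unwind deconstructs the tags array, and $sortByCount counts the documents associated with each tag and sorts the counts in descending order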

> db.exhibits.find()
{ "_id" : 1, "title" : "The Pillars of Society", "artist" : "Grosz", "year" : 1926, "tags" : [ "painting", "satire", "Expressionism", "caricature" ] }
{ "_id" : 2, "title" : "Melancholy III", "artist" : "Munch", "year" : 1902, "tags" : [ "woodcut", "Expressionism" ] }
{ "_id" : 3, "title" : "Dancer", "artist" : "Miro", "year" : 1925, "tags" : [ "oil", "Surrealism", "painting" ] }
{ "_id" : 4, "title" : "The Great Wave off Kanagawa", "artist" : "Hokusai", "tags" : [ "woodblock", "ukiyo-e" ] }
{ "_id" : 5, "title" : "The Persistence of Memory", "artist" : "Dali", "year" : 1931, "tags" : [ "Surrealism", "painting", "oil" ] }
{ "_id" : 6, "title" : "Composition VII", "artist" : "Kandinsky", "year" : 1913, "tags" : [ "oil", "painting", "abstract" ] }
{ "_id" : 7, "title" : "The Scream", "artist" : "Munch", "year" : 1893, "tags" : [ "Expressionism", "painting", "oil" ] }
{ "_id" : 8, "title" : "Blue Flower", "artist" : "O'Keefe", "year" : 1918, "tags" : [ "abstract", "painting" ] }

> db.exhibits.aggregate([
... {$unwind : "$tags"},
... {$sortByCount : "$tags"}
... ])
{ "_id" : "painting", "count" : 6 }
{ "_id" : "oil", "count" : 4 }
{ "_id" : "Expressionism", "count" : 3 }
{ "_id" : "abstract", "count" : 2 }
{ "_id" : "Surrealism", "count" : 2 }
{ "_id" : "ukiyo-e", "count" : 1 }
{ "_id" : "woodblock", "count" : 1 }
{ "_id" : "woodcut", "count" : 1 }
{ "_id" : "satire", "count" : 1 }
{ "_id" : "caricature", "count" : 1 }
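
The counts can be verified with a short in-memory sketch of the $unwind + $sortByCount combination. This is plain JavaScript rather than MongoDB code; sortByCount is a hypothetical helper and only the tags field of each exhibit is kept.

```javascript
// In-memory copy of the exhibits collection (only _id and tags are needed).
const exhibits = [
  { _id: 1, tags: ["painting", "satire", "Expressionism", "caricature"] },
  { _id: 2, tags: ["woodcut", "Expressionism"] },
  { _id: 3, tags: ["oil", "Surrealism", "painting"] },
  { _id: 4, tags: ["woodblock", "ukiyo-e"] },
  { _id: 5, tags: ["Surrealism", "painting", "oil"] },
  { _id: 6, tags: ["oil", "painting", "abstract"] },
  { _id: 7, tags: ["Expressionism", "painting", "oil"] },
  { _id: 8, tags: ["abstract", "painting"] },
];

// Hypothetical helper: count occurrences of each value ($group with $sum: 1),
// then order the groups by count, highest first ($sort: { count: -1 }).
function sortByCount(values) {
  const counts = new Map();
  for (const v of values) counts.set(v, (counts.get(v) || 0) + 1);
  return [...counts]
    .map(([_id, count]) => ({ _id, count }))
    .sort((a, b) => b.count - a.count);
}

// flatMap plays the role of $unwind: one value per array element.
const tagCounts = sortByCount(exhibits.flatMap((d) => d.tags));
console.log(tagCounts);
```

Groups with equal counts may appear in a different relative order than in the shell output above; only the descending order of the counts is meaningful.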
