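The previous article, "MongoDB Aggregation: Aggregation Pipeline", explained what the aggregation pipeline is and described its parameters in detail. This article covers the $group stage of the aggregation framework, which is comparable to GROUP BY in MySQL.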

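1. Introduction

Description:

Groups input documents by the specified _id expression and outputs one document for each distinct group. The _id field of each output document contains the unique group-by value. Output documents can also contain computed fields that hold the values of accumulator expressions. $group does not order its output documents.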

Syntax:

{
  $group:
    {
      _id: <expression>, // Group By Expression
      <field1>: { <accumulator1> : <expression1> },
      ...
    }
}

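Parameters:

_id: Required. If you specify an _id value of null, or any other constant value, the $group stage calculates accumulated values for all the input documents as a whole. See the group-by-null example in section 2.5. Both _id and the accumulator operators accept any valid expression.

field: Optional. Computed using an accumulator operator.

The operator must be one of the following accumulator operators: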

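$addToSet: Returns an array of unique expression values for each group. The order of the array elements is undefined.

$avg: Returns the average of numeric values. Non-numeric values are ignored.

$first: Returns a value from the first document for each group. The order is only defined if the documents are in a defined order.

$last: Returns a value from the last document for each group. The order is only defined if the documents are in a defined order.

$max: Returns the highest expression value for each group.

$min: Returns the lowest expression value for each group.

$push: Returns an array of expression values for each group.

$sum: Returns the sum of numeric values. Non-numeric values are ignored.

$mergeObjects: Returns a document created by combining the input documents for each group.

$stdDevPop: Returns the population standard deviation of the input values.

$stdDevSamp: Returns the sample standard deviation of the input values.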
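To make the accumulator descriptions above more concrete, here is a minimal in-memory sketch in plain JavaScript. It only mimics the behaviour of a few accumulators on an array and is not how MongoDB implements them; the helper groupByKey and the sample rows are hypothetical names introduced for illustration.

```javascript
// In-memory sketch of a few $group accumulators (not MongoDB's implementation).
// groupByKey and sampleRows are hypothetical names used only for this example.
function groupByKey(docs, keyFn) {
  const groups = new Map();
  for (const doc of docs) {
    const key = keyFn(doc);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(doc);
  }
  return groups;
}

const sampleRows = [
  { item: "abc", quantity: 2 },
  { item: "abc", quantity: 10 },
  { item: "abc", quantity: 2 },
  { item: "xyz", quantity: 20 },
];

const accumulated = [];
for (const [key, members] of groupByKey(sampleRows, (d) => d.item)) {
  const quantities = members.map((d) => d.quantity);
  const sum = quantities.reduce((a, b) => a + b, 0);
  accumulated.push({
    _id: key,
    sum,                                     // like $sum
    avg: sum / quantities.length,            // like $avg
    min: Math.min(...quantities),            // like $min
    max: Math.max(...quantities),            // like $max
    first: quantities[0],                    // like $first (input order)
    last: quantities[quantities.length - 1], // like $last (input order)
    pushed: quantities,                      // like $push: keeps duplicates
    unique: [...new Set(quantities)],        // like $addToSet: drops duplicates
  });
}

console.log(accumulated);
```

The difference between the last two fields is the main point: for the "abc" group the pushed array keeps both 2 values, while the unique array keeps only one.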

2. Examples

Initialize the data:

db.groupExample.insertMany([
  { "_id" : 1, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("2"), "date" : ISODate("2014-03-01T08:00:00Z") },
  { "_id" : 2, "item" : "jkl", "price" : NumberDecimal("20"), "quantity" : NumberInt("1"), "date" : ISODate("2014-03-01T09:00:00Z") },
  { "_id" : 3, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt("10"), "date" : ISODate("2014-03-15T09:00:00Z") },
  { "_id" : 4, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt("20"), "date" : ISODate("2014-04-04T11:21:39.736Z") },
  { "_id" : 5, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("10"), "date" : ISODate("2014-04-04T21:23:13.331Z") },
  { "_id" : 6, "item" : "def", "price" : NumberDecimal("7.5"), "quantity" : NumberInt("5"), "date" : ISODate("2015-06-04T05:08:13Z") },
  { "_id" : 7, "item" : "def", "price" : NumberDecimal("7.5"), "quantity" : NumberInt("10"), "date" : ISODate("2015-09-10T08:43:00Z") },
  { "_id" : 8, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("5"), "date" : ISODate("2016-02-06T20:20:13Z") }
])

2.1. Count the number of documents

Example: the following aggregation uses the $group stage to count the documents in the groupExample collection:

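db.groupExample.aggregate( [
  {
    $group: {
      _id: null,
      // Add 1 for every input document to obtain the count
      count: { $sum: 1 }
    }
  }
] )

Result:

{ "_id" : null, "count" : 8 }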

2.2. Group by a field

Example: group by the item field to retrieve its distinct values:

db.groupExample.aggregate( [ { $group : { _id : "$item" } } ] )

Result (the order of the groups is not guaranteed):

{ "_id" : "xyz" }
{ "_id" : "jkl" }
{ "_id" : "def" }
{ "_id" : "abc" }

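2.3. Group by a field, then filter (HAVING)

Example: the following aggregation groups documents by the item field, calculates the total sale amount per item, and returns only the items whose total sale amount is greater than or equal to 100: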

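db.groupExample.aggregate(
  [
    // First Stage: total sale amount per item
    {
      $group :
        {
          _id : "$item",
          totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } }
        }
    },
    // Second Stage: keep only groups with a total of at least 100
    {
      $match: { "totalSaleAmount": { $gte: 100 } }
    }
  ]
)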

This operation is equivalent to the following MySQL statement:

SELECT item,
       Sum(( price * quantity )) AS totalSaleAmount
FROM   groupExample
GROUP  BY item
HAVING totalSaleAmount >= 100

Result:

{ "_id" : "abc", "totalSaleAmount" : NumberDecimal("170") }

{ "_id" : "xyz", "totalSaleAmount" : NumberDecimal("150") }

{ "_id" : "def", "totalSaleAmount" : NumberDecimal("112.5") }
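The two stages above can be traced by hand with a small plain JavaScript sketch. It uses ordinary numbers instead of NumberDecimal and an in-memory array instead of a collection, so it is only an illustration of the group-then-filter logic; the names itemSales, totalsByItem, and bigSellers are made up for this example.

```javascript
// In-memory sketch of "$group then $match" (the HAVING pattern).
// Plain JS numbers stand in for NumberDecimal; this is not MongoDB code.
const itemSales = [
  { item: "abc", price: 10, quantity: 2 },
  { item: "jkl", price: 20, quantity: 1 },
  { item: "xyz", price: 5, quantity: 10 },
  { item: "xyz", price: 5, quantity: 20 },
  { item: "abc", price: 10, quantity: 10 },
  { item: "def", price: 7.5, quantity: 5 },
  { item: "def", price: 7.5, quantity: 10 },
  { item: "abc", price: 10, quantity: 5 },
];

// "$group": accumulate price * quantity per item.
const totalsByItem = new Map();
for (const { item, price, quantity } of itemSales) {
  totalsByItem.set(item, (totalsByItem.get(item) || 0) + price * quantity);
}

// "$match" applied to the grouped output: the HAVING step.
const bigSellers = [...totalsByItem]
  .map(([item, totalSaleAmount]) => ({ _id: item, totalSaleAmount }))
  .filter((group) => group.totalSaleAmount >= 100);

console.log(bigSellers);
```

The "jkl" group totals only 20, which is why it is missing from the result. The filter has to run after grouping because totalSaleAmount does not exist on the input documents.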

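2.4. Calculate count, sum, and average

Example: for documents dated in 2014, group by day, calculate the total sale amount, average quantity, and document count for each day, then sort by total sale amount in descending order: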

db.groupExample.aggregate([
  // First Stage: keep only documents from 2014
  {
    $match : { "date": { $gte: new ISODate("2014-01-01"), $lt: new ISODate("2015-01-01") } }
  },
  // Second Stage: group by calendar day
  {
    $group : {
      _id : { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
      totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
      averageQuantity: { $avg: "$quantity" },
      count: { $sum: 1 }
    }
  },
  // Third Stage: largest total first
  {
    $sort : { totalSaleAmount: -1 }
  }
])

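This operation is roughly equivalent to the following MySQL statement:

SELECT Date(date) AS saleDate,
       Sum(( price * quantity )) AS totalSaleAmount,
       Avg(quantity) AS averageQuantity,
       Count(*) AS Count
FROM   groupExample
WHERE  date >= '2014-01-01' AND date < '2015-01-01'
GROUP  BY Date(date)
ORDER  BY totalSaleAmount DESC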

Result:

{ "_id" : "2014-04-04", "totalSaleAmount" : NumberDecimal("200"), "averageQuantity" : 15, "count" : 2 }

{ "_id" : "2014-03-15", "totalSaleAmount" : NumberDecimal("50"), "averageQuantity" : 10, "count" : 1 }

{ "_id" : "2014-03-01", "totalSaleAmount" : NumberDecimal("40"), "averageQuantity" : 1.5, "count" : 2 }
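The three stages can also be followed step by step in plain JavaScript. The sketch below is an in-memory approximation under a few assumptions: plain numbers replace NumberDecimal, the day key is taken in UTC (which matches $dateToString when no timezone is given), and the names dayOrders, byDay, and daily are hypothetical.

```javascript
// In-memory sketch of $match -> $group by day -> $sort (not MongoDB code).
const dayOrders = [
  { price: 10, quantity: 2, date: new Date("2014-03-01T08:00:00Z") },
  { price: 20, quantity: 1, date: new Date("2014-03-01T09:00:00Z") },
  { price: 5, quantity: 10, date: new Date("2014-03-15T09:00:00Z") },
  { price: 5, quantity: 20, date: new Date("2014-04-04T11:21:39.736Z") },
  { price: 10, quantity: 10, date: new Date("2014-04-04T21:23:13.331Z") },
  { price: 7.5, quantity: 5, date: new Date("2015-06-04T05:08:13Z") },
];

const rangeStart = new Date("2014-01-01T00:00:00Z");
const rangeEnd = new Date("2015-01-01T00:00:00Z");

const byDay = new Map();
for (const order of dayOrders) {
  // "$match": skip anything outside 2014.
  if (order.date < rangeStart || order.date >= rangeEnd) continue;
  // "$dateToString" with "%Y-%m-%d": the UTC calendar day as the group key.
  const day = order.date.toISOString().slice(0, 10);
  const group = byDay.get(day) || { _id: day, totalSaleAmount: 0, quantitySum: 0, count: 0 };
  group.totalSaleAmount += order.price * order.quantity;
  group.quantitySum += order.quantity;
  group.count += 1;
  byDay.set(day, group);
}

// Turn the running quantity sum into an average, then "$sort" descending.
const daily = [...byDay.values()]
  .map(({ quantitySum, ...group }) => ({ ...group, averageQuantity: quantitySum / group.count }))
  .sort((a, b) => b.totalSaleAmount - a.totalSaleAmount);

console.log(daily);
```

Keeping a running sum and a count and dividing at the end is one simple way to compute an average in a single pass over the data.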

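2.5. Group by null to calculate totals

Example: the following aggregation specifies a group _id of null, which calculates the total sale amount, average quantity, and count across all documents in the collection: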

db.groupExample.aggregate([
  {
    $group : {
      // A constant _id puts every document into one group
      _id : null,
      totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
      averageQuantity: { $avg: "$quantity" },
      count: { $sum: 1 }
    }
  }
])

This operation is equivalent to the following MySQL statement:

SELECT Sum(price * quantity) AS totalSaleAmount,
       Avg(quantity) AS averageQuantity,
       Count(*) AS Count
FROM   groupExample

Result:

{
  "_id" : null,
  "totalSaleAmount" : NumberDecimal("452.5"),
  "averageQuantity" : 7.875,
  "count" : 8
}
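Grouping by a constant _id amounts to a single reduction over the whole collection. As a rough illustration, the sketch below reproduces the figures above in plain JavaScript, with ordinary numbers in place of NumberDecimal; allSales and summary are hypothetical names.

```javascript
// In-memory sketch of { $group: { _id: null, ... } }: one pass, one output document.
const allSales = [
  { price: 10, quantity: 2 },
  { price: 20, quantity: 1 },
  { price: 5, quantity: 10 },
  { price: 5, quantity: 20 },
  { price: 10, quantity: 10 },
  { price: 7.5, quantity: 5 },
  { price: 7.5, quantity: 10 },
  { price: 10, quantity: 5 },
];

let totalSaleAmount = 0;
let quantitySum = 0;
for (const sale of allSales) {
  totalSaleAmount += sale.price * sale.quantity; // like $sum of $multiply
  quantitySum += sale.quantity;                  // needed for the average
}

const summary = {
  _id: null,
  totalSaleAmount,
  averageQuantity: quantitySum / allSales.length, // like $avg
  count: allSales.length,                         // like { $sum: 1 }
};

console.log(summary);
```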

2.6. Group by a field and collect each group's values into an array

Initialize the data:

db.books.insertMany([
  { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
  { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
  { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 },
  { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
  { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
])

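Example: group the books by author and push each title into a books array:

db.books.aggregate([
  { $group : { _id : "$author", books: { $push: "$title" } } }
])

This operation is roughly comparable to the following MySQL statement, except that GROUP_CONCAT returns a single comma-separated string rather than an array:

SELECT author, GROUP_CONCAT(title) AS books FROM books GROUP BY author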

Result:

{
  "_id" : "Homer",
  "books" : [
    "The Odyssey",
    "Iliad"
  ]
}
{
  "_id" : "Dante",
  "books" : [
    "The Banquet",
    "Divine Comedy",
    "Eclogues"
  ]
}
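The difference between an array of titles and a concatenated string is easy to see in a small in-memory sketch. This is plain JavaScript rather than MongoDB or MySQL code, and the names library and titlesByAuthor are assumptions made for the example.

```javascript
// In-memory sketch of { $group: { _id: "$author", books: { $push: "$title" } } }.
const library = [
  { title: "The Banquet", author: "Dante" },
  { title: "Divine Comedy", author: "Dante" },
  { title: "Eclogues", author: "Dante" },
  { title: "The Odyssey", author: "Homer" },
  { title: "Iliad", author: "Homer" },
];

const titlesByAuthor = {};
for (const { author, title } of library) {
  if (!titlesByAuthor[author]) titlesByAuthor[author] = [];
  titlesByAuthor[author].push(title); // like $push: append in input order
}

// Joining the array gives the string shape that GROUP_CONCAT would produce.
const homerConcat = titlesByAuthor.Homer.join(",");

console.log(titlesByAuthor, homerConcat);
```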

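2.7. Group whole documents using the $$ROOT system variable

Initialize the data. This is the same books data as in section 2.6, so skip this step if the documents were already inserted; inserting the same _id values again fails with a duplicate key error.

db.books.insertMany([
  { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
  { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
  { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 },
  { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
  { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
])

Example: group the books by author, keep each full document in the books array, then add the total number of copies per author: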

db.books.aggregate([
  // First Stage: push each whole document into its author's group
  {
    $group : { _id : "$author", books: { $push: "$$ROOT" } }
  },
  // Second Stage: sum the copies of all books in the group
  {
    $addFields:
      {
        totalCopies : { $sum: "$books.copies" }
      }
  }
])

Result:

{
  "_id" : "Homer",
  "books" : [
    { "_id" : 7000.0, "title" : "The Odyssey", "author" : "Homer", "copies" : 10.0 },
    { "_id" : 7020.0, "title" : "Iliad", "author" : "Homer", "copies" : 10.0 }
  ],
  "totalCopies" : 20.0
}
{
  "_id" : "Dante",
  "books" : [
    { "_id" : 8751.0, "title" : "The Banquet", "author" : "Dante", "copies" : 2.0 },
    { "_id" : 8752.0, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1.0 },
    { "_id" : 8645.0, "title" : "Eclogues", "author" : "Dante", "copies" : 2.0 }
  ],
  "totalCopies" : 5.0
}
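In the second stage, the path "$books.copies" resolves to an array holding the copies value of every document in books, and $sum then adds up that array. The following in-memory sketch in plain JavaScript walks through both stages; it is an approximation for illustration only, and shelf, groupedByAuthor, and withTotals are hypothetical names.

```javascript
// In-memory sketch of $push: "$$ROOT" followed by $addFields with $sum.
const shelf = [
  { _id: 8751, title: "The Banquet", author: "Dante", copies: 2 },
  { _id: 8752, title: "Divine Comedy", author: "Dante", copies: 1 },
  { _id: 8645, title: "Eclogues", author: "Dante", copies: 2 },
  { _id: 7000, title: "The Odyssey", author: "Homer", copies: 10 },
  { _id: 7020, title: "Iliad", author: "Homer", copies: 10 },
];

// First stage: keep the whole input document in each group.
const groupedByAuthor = new Map();
for (const book of shelf) {
  if (!groupedByAuthor.has(book.author)) {
    groupedByAuthor.set(book.author, { _id: book.author, books: [] });
  }
  groupedByAuthor.get(book.author).books.push(book); // like $push: "$$ROOT"
}

// Second stage: add a computed field to every grouped document.
const withTotals = [...groupedByAuthor.values()].map((group) => {
  const copies = group.books.map((b) => b.copies); // what "$books.copies" resolves to
  return { ...group, totalCopies: copies.reduce((a, b) => a + b, 0) };
});

console.log(JSON.stringify(withTotals, null, 2));
```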

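3. Notes

Memory limit:

The $group stage has a limit of 100 MB of RAM. By default, if the stage exceeds this limit, $group returns an error. To allow processing of large datasets, set the allowDiskUse option to true. This flag lets $group operations write to temporary files. For more information, see the db.collection.aggregate() method and the aggregate command.

Changed in version 2.6: MongoDB introduced the 100 MB RAM limit for the $group stage, along with the allowDiskUse option for handling operations on large datasets.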
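As a brief sketch of how the option is passed: allowDiskUse goes in the options document that db.collection.aggregate() accepts as its second argument. The wrapper aggregateWithDiskUse below is a hypothetical helper, not part of MongoDB; it only forwards the pipeline together with that option.

```javascript
// Hypothetical helper: runs a pipeline with allowDiskUse enabled.
// `collection` is assumed to expose aggregate(pipeline, options),
// as shell collections such as db.groupExample do.
function aggregateWithDiskUse(collection, pipeline) {
  return collection.aggregate(pipeline, { allowDiskUse: true });
}

// Example pipeline from section 2.2.
const distinctItemsPipeline = [{ $group: { _id: "$item" } }];

// In the shell this would be called as:
//   aggregateWithDiskUse(db.groupExample, distinctItemsPipeline)
// which is the same as:
//   db.groupExample.aggregate(distinctItemsPipeline, { allowDiskUse: true })
```

Wrapping the call is optional; passing the options document directly to aggregate() has the same effect.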
