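Series 9: MongoDB Aggregation Queries

1. Overview

Aggregation (aggregate) in MongoDB is mainly used to process data (for example, computing averages or sums) and return the computed result. An aggregation typically:

①: operates on one or more collections;

②: performs a series of computations on the data in those collections;

③: transforms the data into the desired shape.

In terms of effect, the aggregation framework covers what group by, left outer join, as, and similar constructs do in SQL.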

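1.1. Aggregation pipeline method

In Unix or Linux, a pipe usually feeds the output of one command to the next command as its input. MongoDB's aggregation pipeline works the same way: documents are processed by one stage, and the result is passed on to the next stage.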

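2. Aggregation

2.1. Pipeline and Stage

The whole aggregation operation is called a pipeline. It is made up of multiple stages, and in each pipeline:

①: the pipeline receives a series of documents (the raw data);

②: each stage performs a series of operations on those documents;

③: the resulting documents are output to the next stage.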
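The three points above can be sketched in plain JavaScript. This is only an illustration of the idea, not how MongoDB implements it: `runPipeline`, `matchStage`, and `skipStage` are hypothetical helper names, and the sample documents are made up.

```javascript
// Hypothetical helper: runs documents through a list of stage functions.
// Each stage takes an array of documents and returns a new array.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Two illustrative stages (assumed names, loosely modelled on $match and $skip).
const matchStage = (predicate) => (docs) => docs.filter(predicate);
const skipStage = (n) => (docs) => docs.slice(n);

// Made-up sample documents.
const sampleDocs = [
  { _id: 1, likes: 5 },
  { _id: 2, likes: 50 },
  { _id: 3, likes: 500 },
];

const result = runPipeline(sampleDocs, [
  matchStage((d) => d.likes >= 50), // keeps _id 2 and 3
  skipStage(1),                     // drops the first remaining document
]);

console.log(result); // [ { _id: 3, likes: 500 } ]
```

The output of each stage is the only thing the next stage sees, which is why the order of stages matters.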

2.2. Basic format

pipeline = [$stage1, $stage2, ..., $stageN]

db.<COLLECTION>.aggregate(
    pipeline,
    { options }
);
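As a hedged illustration of this format: each stage is an object with exactly one key, and that key is the stage name starting with `$`. The snippet below builds such a pipeline for the `language` collection used later in this article and checks its shape in plain JavaScript. `isStage` is a hypothetical helper, not a MongoDB API, and `allowDiskUse` is shown only as one example of an option.

```javascript
// Hypothetical helper: a stage is an object with a single "$..." key.
function isStage(stage) {
  const keys = Object.keys(stage);
  return keys.length === 1 && keys[0].startsWith("$");
}

const pipeline = [
  { $match: { by_user: "runoob.com" } },
  { $group: { _id: "$by_user", total: { $sum: "$likes" } } },
];
const options = { allowDiskUse: true };

console.log(pipeline.every(isStage)); // true

// In the mongo shell the call would then be:
// db.language.aggregate(pipeline, options);
```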

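2.3. Common stages

2.3.1. Operators used in common stages

2.4. Less common stages

2.5. Use cases

Aggregation queries can be used in both OLAP and OLTP scenarios.

2.6. MQL vs SQL

2.6.1. vs1

2.6.2. vs2

2.6.3. MongoDB-specific stage: $unwind

2.6.4. MongoDB-specific stage: $bucket

2.6.5. MongoDB-specific stage: $facet

3. Common aggregation expressions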


4. Aggregation query examples

4.1. Aggregation query 1

4.1.1. Data initialization

use 20230620_mongodb_study
db.language.insert([
	{
		title: 'MongoDB Overview',
		description: 'MongoDB is a nosql database',
		by_user: 'runoob.com',
		url: 'http://www.runoob.com',
		tags: ['mongodb','database','NoSQL'],
		likes: 100
	},
	{
		title: 'NoSQL Overview',
		description: 'NoSQL database is very fast',
		by_user: 'runoob.com',
		url: 'http://www.runoob.com',
		tags: ['mongodb','database','NoSQL'],
		likes: 10
	},
	{
		title: 'Redis Overview',
		description: 'redis is not only sql',
		by_user: 'runoob.com',
		url: 'http://www.redis.com',
		tags: ['redis','database','NoSQL'],
		likes: 100
	},
	{
		title: 'Neo4j Overview',
		description: 'Neo4j is a nosql database',
		by_user: 'Neo4j',
		url: 'http://www.neo4j.com',
		tags: ['neo4j','database','NoSQL'],
		likes: 750
	}
])

4.1.2. Count the number of articles written by each author

db.language.aggregate([
	{$group:{'_id':'$by_user','num_tutorial':{$sum:1}}}
])
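To make the result easier to predict, here is a plain-JavaScript sketch of what this `$group` with `{$sum: 1}` computes on the four sample documents. It only mimics the grouping logic; `countBy` is a hypothetical helper, and the order of groups returned by MongoDB itself is not guaranteed.

```javascript
// Only the field used for grouping is kept from the sample data above.
const articles = [
  { by_user: "runoob.com" },
  { by_user: "runoob.com" },
  { by_user: "runoob.com" },
  { by_user: "Neo4j" },
];

// Hypothetical helper mimicking {$group: {_id: '$<field>', num_tutorial: {$sum: 1}}}.
function countBy(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  return [...counts].map(([_id, num_tutorial]) => ({ _id, num_tutorial }));
}

const counts = countBy(articles, "by_user");
console.log(counts);
// [ { _id: 'runoob.com', num_tutorial: 3 }, { _id: 'Neo4j', num_tutorial: 1 } ]
```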

4.1.3. Compute the average number of likes per article for each author

db.language.aggregate([
	{$group:{'_id':'$by_user','avg_stars':{$avg:'$likes'}}}
])

4.1.4. Compute the total number of likes across each author's articles

db.language.aggregate([
	{$group:{'_id':'$by_user','total_stars':{$sum:'$likes'}}}
])

4.1.5. Find the highest like count among each author's articles

db.language.aggregate([
	{$group:{'_id':'$by_user','max_stars':{$max:'$likes'}}}
])

4.1.6. Find the lowest like count among each author's articles

db.language.aggregate([
	{$group:{'_id':'$by_user','min_stars':{$min:'$likes'}}}
])
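The accumulators used in 4.1.3 to 4.1.6 ($avg, $sum, $max, $min) can be checked against the sample data with a small plain-JavaScript sketch. It is only an approximation of the semantics for this data set, not the server's implementation.

```javascript
// Only the fields the accumulators read are kept from the sample data.
const posts = [
  { by_user: "runoob.com", likes: 100 },
  { by_user: "runoob.com", likes: 10 },
  { by_user: "runoob.com", likes: 100 },
  { by_user: "Neo4j", likes: 750 },
];

// Collect the likes of each author, then derive the four statistics.
const likesByUser = new Map();
for (const { by_user, likes } of posts) {
  if (!likesByUser.has(by_user)) likesByUser.set(by_user, []);
  likesByUser.get(by_user).push(likes);
}

const stats = [...likesByUser].map(([_id, likes]) => {
  const total = likes.reduce((a, b) => a + b, 0);
  return {
    _id,
    total_stars: total,              // $sum
    avg_stars: total / likes.length, // $avg
    max_stars: Math.max(...likes),   // $max
    min_stars: Math.min(...likes),   // $min
  };
});

console.log(stats);
// [ { _id: 'runoob.com', total_stars: 210, avg_stars: 70, max_stars: 100, min_stars: 10 },
//   { _id: 'Neo4j', total_stars: 750, avg_stars: 750, max_stars: 750, min_stars: 750 } ]
```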

4.1.7. $push

Note: appends the value to an array without checking for duplicates.

db.language.aggregate([
	{$group:{'_id':'$by_user',url:{$push:'$url'}}}
])

4.1.8. $addToSet

Note: adds the value to an array only if it is not already present, so duplicate values are skipped.

db.language.aggregate([
	{$group:{'_id':'$by_user',url:{$addToSet:'$url'}}}
])
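The difference between $push and $addToSet shows up for runoob.com, whose three documents contain only two distinct URLs. A rough plain-JavaScript analogy uses an array for $push and a `Set` for $addToSet. Note that MongoDB does not specify the order of elements produced by $addToSet, whereas a JavaScript `Set` keeps insertion order.

```javascript
// Only the fields the accumulators read are kept from the sample data.
const urlDocs = [
  { by_user: "runoob.com", url: "http://www.runoob.com" },
  { by_user: "runoob.com", url: "http://www.runoob.com" },
  { by_user: "runoob.com", url: "http://www.redis.com" },
  { by_user: "Neo4j", url: "http://www.neo4j.com" },
];

const pushed = {};   // mimics $push: keeps every value
const distinct = {}; // mimics $addToSet: keeps distinct values only
for (const { by_user, url } of urlDocs) {
  if (!pushed[by_user]) {
    pushed[by_user] = [];
    distinct[by_user] = new Set();
  }
  pushed[by_user].push(url);
  distinct[by_user].add(url);
}

console.log(pushed["runoob.com"]);
// [ 'http://www.runoob.com', 'http://www.runoob.com', 'http://www.redis.com' ]
console.log([...distinct["runoob.com"]]);
// [ 'http://www.runoob.com', 'http://www.redis.com' ]
```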

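4.1.9. $first

Note: returns the value from the first document in each group, following the order in which documents reach the stage. The result is only well defined when the documents arrive in a known order, for example after a $sort.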

db.language.aggregate([
	{$group:{'_id':'$by_user',url:{$first:'$url'}}}
])

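4.1.10. $last

Note: returns the value from the last document in each group, following the order in which documents reach the stage. As with $first, the result is only well defined when the documents arrive in a known order.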

db.language.aggregate([
	{$group:{'_id':'$by_user',url:{$last:'$url'}}}
])
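A plain-JavaScript sketch of $first and $last, assuming the documents reach the stage in the order they were inserted above (an assumption; without a preceding $sort the server does not promise any particular order):

```javascript
// Only the fields the accumulators read are kept, in insertion order.
const orderedDocs = [
  { by_user: "runoob.com", url: "http://www.runoob.com" },
  { by_user: "runoob.com", url: "http://www.runoob.com" },
  { by_user: "runoob.com", url: "http://www.redis.com" },
  { by_user: "Neo4j", url: "http://www.neo4j.com" },
];

const firstLast = {};
for (const { by_user, url } of orderedDocs) {
  // The first document seen for a group fixes "first" ...
  if (!(by_user in firstLast)) firstLast[by_user] = { first: url };
  // ... and every later document overwrites "last".
  firstLast[by_user].last = url;
}

console.log(firstLast["runoob.com"]);
// { first: 'http://www.runoob.com', last: 'http://www.redis.com' }
```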

4.2. Aggregation query 2

4.2.1. Data initialization

db.orders.remove({})

db.orders.insert(
	[
		{
			"street": "西兴街道",
			"city": "杭州",
			"state": "浙江省",
			"country": "中国",
			"zip": "24344-1715",
			"phone": "18866668888",
			"name": "李白",
			"userId": "3573",
			"orderDate": "2019-01-02 03:20:08.805",
			"status": "completed",
			"shippingFee": 8.00,
			"orderLines": [{
					"product": "iPhone5",
					"sku": "2001",
					"qty": 1,
					"price": 100.00,
					"cost": 100.00
				},
				{
					"product": "iPhone5s",
					"sku": "2002",
					"qty": 2,
					"price": 200.00,
					"cost": 400.00
				},
				{
					"product": "iPhone6",
					"sku": "2003",
					"qty": 1,
					"price": 300.00,
					"cost": 300.00
				},
				{
					"product": "iPhone6s",
					"sku": "2004",
					"qty": 2,
					"price": 400.00,
					"cost": 800.00
				},
				{
					"product": "iPhone8",
					"sku": "2005",
					"qty": 2,
					"price": 500.00,
					"cost": 1000.00
				}
			],
			"total": 2600
		},
		{
			"street": "长河街道",
			"city": "杭州",
			"state": "浙江省",
			"country": "中国",
			"zip": "24344-1716",
			"phone": "18866668881",
			"name": "杜甫",
			"userId": "3574",
			"orderDate": "2019-02-02 13:20:08.805",
			"status": "completed",
			"shippingFee": 5.00,
			"orderLines": [{
					"product": "iPhone5",
					"sku": "2001",
					"qty": 1,
					"price": 100.00,
					"cost": 100.00
				},
				{
					"product": "iPhone5s",
					"sku": "2002",
					"qty": 2,
					"price": 200.00,
					"cost": 400.00
				},
				{
					"product": "iPhone6",
					"sku": "2003",
					"qty": 1,
					"price": 300.00,
					"cost": 300.00
				},
				{
					"product": "iPhone6s",
					"sku": "2004",
					"qty": 2,
					"price": 400.00,
					"cost": 800.00
				},
				{
					"product": "iPhone8",
					"sku": "2005",
					"qty": 2,
					"price": 500.00,
					"cost": 1000.00
				}
			],
			"total": 2600
		},
		{
			"street": "浦沿街道",
			"city": "杭州",
			"state": "浙江省",
			"country": "中国",
			"zip": "24344-1717",
			"phone": "18866668882",
			"name": "王安石",
			"userId": "3575",
			"orderDate": "2019-03-02 14:20:08.805",
			"status": "completed",
			"shippingFee": 20.00,
			"orderLines": [{
					"product": "iPhone5",
					"sku": "2001",
					"qty": 1,
					"price": 100.00,
					"cost": 100.00
				},
				{
					"product": "iPhone5s",
					"sku": "2002",
					"qty": 2,
					"price": 200.00,
					"cost": 400.00
				},
				{
					"product": "iPhone6",
					"sku": "2003",
					"qty": 1,
					"price": 300.00,
					"cost": 300.00
				},
				{
					"product": "iPhone6s",
					"sku": "2004",
					"qty": 2,
					"price": 400.00,
					"cost": 800.00
				},
				{
					"product": "iPhone12 ProMax",
					"sku": "2006",
					"qty": 1,
					"price": 1500.00,
					"cost": 1500.00
				}
			],
			"total": 3100
		},
		{
			"street": "长庆街道",
			"city": "杭州",
			"state": "浙江省",
			"country": "中国",
			"zip": "24344-1717",
			"phone": "18866668883",
			"name": "苏东坡",
			"userId": "3576",
			"orderDate": "2019-04-02 15:20:08.805",
			"status": "completed",
			"shippingFee": 10.00,
			"orderLines": [
				{
					"product": "iPhone6s",
					"sku": "2004",
					"qty": 2,
					"price": 400.00,
					"cost": 800.00
				},
				{
					"product": "iPhone12 ProMax",
					"sku": "2006",
					"qty": 1,
					"price": 1500.00,
					"cost": 1500.00
				}
			],
			"total": 2300
		}
	]
)

4.2.2. Compute the total sales of all orders to date

db.orders.aggregate([
	{$group:{'_id':null,'total':{$sum:'$total'}}}
])

4.2.3. For completed orders in the first quarter of 2019 (January 1 to March 31), compute the total order amount and the number of orders

db.orders.aggregate([
	{$match:{'status':'completed','orderDate':{$gte:'2019-01-01',$lt:'2019-04-01'}}},
	{$group:{'_id':null,'total':{$sum:'$total'},'shippingFee':{$sum:'$shippingFee'},'count':{$sum:1}}},
	{$project:{'grandTotal':{'$add':['$total','$shippingFee']},'count':'$count','_id':0}}
])
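Because `orderDate` is stored as a string here, the `$match` range relies on lexicographic comparison, which works only because the dates are zero-padded and written year first. The plain-JavaScript sketch below mirrors the three stages on the sample orders to show the expected numbers; it is an approximation for this data set, not the server's implementation.

```javascript
// Only the fields the pipeline reads are kept from the sample orders.
const orders = [
  { orderDate: "2019-01-02 03:20:08.805", status: "completed", shippingFee: 8, total: 2600 },
  { orderDate: "2019-02-02 13:20:08.805", status: "completed", shippingFee: 5, total: 2600 },
  { orderDate: "2019-03-02 14:20:08.805", status: "completed", shippingFee: 20, total: 3100 },
  { orderDate: "2019-04-02 15:20:08.805", status: "completed", shippingFee: 10, total: 2300 },
];

// $match: plain string comparison, as in the pipeline above.
const q1 = orders.filter(
  (o) => o.status === "completed" && o.orderDate >= "2019-01-01" && o.orderDate < "2019-04-01"
);

// $group with _id: null, followed by $project adding the two sums.
const total = q1.reduce((sum, o) => sum + o.total, 0);
const shippingFee = q1.reduce((sum, o) => sum + o.shippingFee, 0);
const summary = { grandTotal: total + shippingFee, count: q1.length };

console.log(summary); // { grandTotal: 8333, count: 3 }
```

If the dates were stored as Date values instead of strings, the same range could be expressed with Date bounds and would no longer depend on the string format.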

4.3. Aggregation query 3

4.3.1. Data initialization

db.articles.insert([
	{'_id':1,'name':'三国演义','author':'罗贯中','likes':1000},
	{'_id':2,'name':'水浒传','author':'施耐庵','likes':1002},
	{'_id':3,'name':'西游记','author':'吴承恩','likes':1004},
	{'_id':4,'name':'红楼梦','author':'曹雪芹','likes':1006}
])

4.3.2. $project: return the book name and author from the articles collection

db.articles.aggregate([
	{$project: {'_id':0,'likes':0}}
])
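In this `$project`, setting a field to 0 excludes it and every other field is kept, which is why only `name` and `author` remain. A plain-JavaScript sketch of that exclusion behaviour, using two of the sample books and a hypothetical helper `excludeFields`:

```javascript
// Two of the sample documents from the articles collection.
const books = [
  { _id: 1, name: "三国演义", author: "罗贯中", likes: 1000 },
  { _id: 2, name: "水浒传", author: "施耐庵", likes: 1002 },
];

// Hypothetical helper mimicking an exclusion-only $project.
function excludeFields(docs, fields) {
  return docs.map((doc) =>
    Object.fromEntries(Object.entries(doc).filter(([key]) => !fields.includes(key)))
  );
}

const projected = excludeFields(books, ["_id", "likes"]);
console.log(projected);
// [ { name: '三国演义', author: '罗贯中' }, { name: '水浒传', author: '施耐庵' } ]
```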

4.3.3. $match: in the articles collection, sum the likes of books whose _id is greater than 1

db.articles.aggregate([
	{$match:{'_id':{$gt:1}}},
	{$group:{'_id':null,'stars':{$sum:'$likes'}}},
	{$project:{'_id':0}}
])

4.3.4. $skip: skip the first N records

db.articles.aggregate([
	{$skip: 2}
])

4.4. Aggregation query 4

4.4.1. Data initialization

Note: sale: false marks a non-discounted product, and sale: true marks a discounted one.

db.order_detail.insert([
    {'goodsid':'1001','amount':2,'price':10.2,'sale':false},
    {'goodsid':'1002','amount':3,'price':14.8,'sale':false},
    {'goodsid':'1003','amount':10,'price':50,'sale':false},
    {'goodsid':'1004','amount':2,'price':10,'sale':true}
])

4.4.2. Compute the total sales of non-discounted products

db.order_detail.aggregate([
	{$match:{'sale':false}},
	{$group:{'_id':null,'totalAmount':{$sum:{$multiply:['$amount','$price']}}}},
	{$project: {'totalAmount':'$totalAmount'}}
])
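The `$sum` over `$multiply` adds up amount times price for every matching document. A plain-JavaScript sketch on the sample data follows; it assumes ordinary double-precision arithmetic, so the sum can carry a tiny rounding error and is formatted to two decimals before printing.

```javascript
const details = [
  { goodsid: "1001", amount: 2, price: 10.2, sale: false },
  { goodsid: "1002", amount: 3, price: 14.8, sale: false },
  { goodsid: "1003", amount: 10, price: 50, sale: false },
  { goodsid: "1004", amount: 2, price: 10, sale: true },
];

const totalAmount = details
  .filter((d) => d.sale === false)                   // $match
  .reduce((sum, d) => sum + d.amount * d.price, 0);  // $sum of $multiply

console.log(totalAmount.toFixed(2)); // "564.80"
```

For real monetary values, storing prices as integers (for example cents) or as a decimal type is a common way to avoid such rounding effects.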

4.4.3. Compute the total sales of each product

db.order_detail.aggregate([
	{$group:{'_id':'$goodsid','totalAmount':{$sum:{$multiply:['$amount','$price']}}}},
	{$project: {'totalAmount':'$totalAmount'}}
])

4.4.4. Compute the total quantity sold of each product

db.order_detail.aggregate([
	{$group:{'_id':'$goodsid','totalNum':{$sum:'$amount'}}},
	{$project: {'totalNum':'$totalNum'}}
])

4.5. Aggregation query 5

4.5.1. Data initialization

// Chinese data set
db.courses.insert(
	[
		{course:"隐私保护基础",teacher:"吴娟",classperiod:32,experimental_lessons:0,classnum:1},
		{course:"网络安全管理",teacher:"吴娟",classperiod:32,experimental_lessons:0,classnum:1},
		{course:"NoSQL数据库技术",teacher:"陈雨婕",classperiod:48,experimental_lessons:15,classnum:1},
		{course:"操作系统" ,teacher:"陈雨婕",classperiod:64,experimental_lessons:15,classnum:1},
		{course:"大数据处理技术",teacher:"陈雨婕" ,classperiod:48,experimental_lessons:15,classnum:3},
		{course:"人工智能",teacher:"邓敏娜",classperiod:48,experimental_lessons:15,classnum:1},
		{course:"算法分析与设计",teacher:"邓敏娜",classperiod:48,experimental_lessons:0,classnum:2},
		{course:"统计分析技术",teacher:"段红叶",classperiod:32,experimental_lessons:8,classnum:1},
		{course:"非结构化大数据分析",teacher:"段红叶" ,classperiod:32,experimental_lessons:0,classnum:1},
		{course:"计算机网络" ,teacher:"段红叶",classperiod:48,experimental_lessons:0,classnum:2},
		{course:"数据结构与算法课程实践",teacher:"段红叶",classperiod:32,experimental_lessons:0,classnum:1},
		{course:"三维动画模型与渲染",teacher:"韩战壕",classperiod:48,experimental_lessons:15,classnum:1},
		{course:"面向对象程序设计",teacher:"李贝贝",classperiod:64,experimental_lessons:15,classnum:2},
		{course:"面向对象程序设计课程实践",teacher:"李贝贝" ,classperiod:24,experimental_lessons:0,classnum:2},
		{course:"数字信号处理",teacher:"刘欢欢",classperiod:48,experimental_lessons:8,classnum:1},
		{course:"操作系统",teacher:"刘欢欢",classperiod:64,experimental_lessons:15,classnum:2},
		{course:"云计算技术" ,teacher:"王磊",classperiod:48,experimental_lessons:0,classnum:1},
		{course:"智能科学与技术导论",teacher:"王磊",classperiod:48,experimental_lessons:0,classnum:1},
		{course:"虚拟现实与可视化",teacher:"王磊" ,classperiod:48,experimental_lessons:15,classnum:1},
		{course:"系统设计与分析",teacher:"王磊",classperiod:48,experimental_lessons:0,classnum:2},
		{course:"数据库技术课程实践",teacher:"王磊",classperiod:32,experimental_lessons:0,classnum:1},
		{course:"面向对象程序设计",teacher:"韦茜妤",classperiod:64,experimental_lessons:15,classnum:2},
		{course:"面向对象程序设计课程实践",teacher:"韦茜妤" ,classperiod:24,experimental_lessons:0,classnum:2},
		{course:"HTML5开发技术",teacher:"韦茜妤",classperiod:48,experimental_lessons:15,classnum:2},
		{course:"数据结构与算法课程实践",teacher:"韦茜妤" ,classperiod:32,experimental_lessons:0,classnum:1}
	]
)

// English data set
db.courses_en_US.insert(
	[
		{course:"Privacy protection foundation",teacher:"wu juan",classperiod:32,experimental_lessons:0,classnum:1},
		{course:"Network Security Management",teacher:"wu juan",classperiod:32,experimental_lessons:0,classnum:1},
		{course:"NoSQL database technology",teacher:"chen yu jie",classperiod:48,experimental_lessons:15,classnum:1},
		{course:"operating system" ,teacher:"chen yu jie",classperiod:64,experimental_lessons:15,classnum:1},
		{course:"Big data Processing Technology",teacher:"chen yu jie" ,classperiod:48,experimental_lessons:15,classnum:3},
		{course:"artificial intelligence",teacher:"deng min na",classperiod:48,experimental_lessons:15,classnum:1},
		{course:"Algorithm Analysis and Design",teacher:"deng min na",classperiod:48,experimental_lessons:0,classnum:2},
		{course:"Statistical analysis techniques",teacher:"duan ye hong",classperiod:32,experimental_lessons:8,classnum:1},
		{course:"Unstructured Big data analysis",teacher:"duan ye hong" ,classperiod:32,experimental_lessons:0,classnum:1},
		{course:"computer network" ,teacher:"duan ye hong",classperiod:48,experimental_lessons:0,classnum:2},
		{course:"Practice of Data Structure and Algorithms Course",teacher:"duan ye hong",classperiod:32,experimental_lessons:0,classnum:1},
		{course:"3D Animation Model and Rendering",teacher:"han zhan hao",classperiod:48,experimental_lessons:15,classnum:1},
		{course:"Object-Oriented Programming",teacher:"li bei bei",classperiod:64,experimental_lessons:15,classnum:2},
		{course:"Object Oriented Programming Course Practice",teacher:"li bei bei" ,classperiod:24,experimental_lessons:0,classnum:2},
		{course:"Digital signal processing",teacher:"liu huan huan",classperiod:48,experimental_lessons:8,classnum:1},
		{course:"operating system",teacher:"liu huan huan",classperiod:64,experimental_lessons:15,classnum:2},
		{course:"Cloud computing technology" ,teacher:"wang lei",classperiod:48,experimental_lessons:0,classnum:1},
		{course:"Introduction to Intelligent Science and Technology",teacher:"wang lei",classperiod:48,experimental_lessons:0,classnum:1},
		{course:"Virtual Reality and Visualization",teacher:"wang lei" ,classperiod:48,experimental_lessons:15,classnum:1},
		{course:"system design and analysis",teacher:"wang lei",classperiod:48,experimental_lessons:0,classnum:2},
		{course:"Database Technology Course Practice",teacher:"wang lei",classperiod:32,experimental_lessons:0,classnum:1},
		{course:"Object-Oriented Programming",teacher:"wei qian yu",classperiod:64,experimental_lessons:15,classnum:2},
		{course:"Object Oriented Programming Course Practice",teacher:"wei qian yu" ,classperiod:24,experimental_lessons:0,classnum:2},
		{course:"HTML5 development technology",teacher:"wei qian yu",classperiod:48,experimental_lessons:15,classnum:2},
		{course:"Practice of Data Structure and Algorithms Course",teacher:"wei qian yu" ,classperiod:32,experimental_lessons:0,classnum:1}
	]
)

4.5.2. Compute the total number of lessons each teacher gives

db.courses.aggregate([
    {$group:{'_id':'$teacher','lesson_num':{$sum:{$multiply:['$classperiod','$classnum']}}}}
])

db.courses_en_US.aggregate([
    {$group:{'_id':'$teacher','lesson_num':{$sum:{$multiply:['$classperiod','$classnum']}}}}
])
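Each course contributes `classperiod * classnum` lessons to its teacher. The plain-JavaScript sketch below checks the arithmetic on a subset of the `courses_en_US` documents (two teachers only, chosen for brevity); it mirrors the grouping logic and is not the server's implementation.

```javascript
// A subset of the courses_en_US documents above (two teachers only).
const courses = [
  { teacher: "wu juan", classperiod: 32, classnum: 1 },
  { teacher: "wu juan", classperiod: 32, classnum: 1 },
  { teacher: "chen yu jie", classperiod: 48, classnum: 1 },
  { teacher: "chen yu jie", classperiod: 64, classnum: 1 },
  { teacher: "chen yu jie", classperiod: 48, classnum: 3 },
];

// Mimics $group by teacher with $sum over $multiply.
const lessons = {};
for (const c of courses) {
  lessons[c.teacher] = (lessons[c.teacher] || 0) + c.classperiod * c.classnum;
}

console.log(lessons); // { 'wu juan': 64, 'chen yu jie': 256 }
```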

4.5.3. Find the teachers who have courses longer than 32 class periods, and how many such courses each of them teaches

db.courses.aggregate([
	{$match:{'classperiod':{$gt:32}}},
	{$group:{'_id':'$teacher','classnum':{$sum:1}}}
])
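Because `$match` runs before `$group`, a teacher whose courses all have 32 class periods or fewer never reaches the grouping stage and is absent from the result. A plain-JavaScript sketch on a subset of the English course data (two teachers only) illustrates this:

```javascript
// A subset of the course documents (two teachers only).
const courseSubset = [
  { teacher: "wu juan", classperiod: 32 },
  { teacher: "wu juan", classperiod: 32 },
  { teacher: "chen yu jie", classperiod: 48 },
  { teacher: "chen yu jie", classperiod: 64 },
  { teacher: "chen yu jie", classperiod: 48 },
];

const longCourses = {};
for (const c of courseSubset) {
  if (c.classperiod > 32) {                                     // $match
    longCourses[c.teacher] = (longCourses[c.teacher] || 0) + 1; // $group with $sum: 1
  }
}

console.log(longCourses); // { 'chen yu jie': 3 }
```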

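4.6. Aggregation query 6

4.6.1. Data initialization

Link: https://pan.baidu.com/s/1y1eKTisow4FTAKF6DFkzsw?pwd=yyds
Extraction code: yyds

Instructions:
1. Download the script (source: https://media.mongodb.org/zips.json)
2. Run db.aggregation.insert(<contents of the script from step 1>), so that the data is loaded into the aggregation collection that the queries in this section read from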

4.6.2. Find the states with a population over 10 million

db.aggregation.aggregate([
	{$group:{'_id':'$state','totalProp':{$sum:'$pop'}}},
	{$match:{'totalProp':{$gt:10 * 1000 * 1000}}}
])

Equivalent to the following SQL:

select state, sum(pop) as totalProp from aggregation group by state having totalProp > (10*1000*1000)
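A `$match` placed after `$group` filters the grouped results, which is the role HAVING plays in SQL. The plain-JavaScript sketch below shows this on made-up documents in the shape of zips.json (`state`, `city`, `pop`); the state codes and populations are hypothetical and are not taken from the real data set.

```javascript
// Made-up documents in the shape of zips.json; not real data.
const zips = [
  { state: "AA", city: "X", pop: 6000000 },
  { state: "AA", city: "Y", pop: 5000000 },
  { state: "BB", city: "Z", pop: 9000000 },
];

// $group: total population per state.
const popByState = {};
for (const z of zips) {
  popByState[z.state] = (popByState[z.state] || 0) + z.pop;
}

// $match after $group: filters on the aggregated value, like HAVING.
const bigStates = Object.entries(popByState)
  .filter(([, totalProp]) => totalProp > 10 * 1000 * 1000)
  .map(([_id, totalProp]) => ({ _id, totalProp }));

console.log(bigStates); // [ { _id: 'AA', totalProp: 11000000 } ]
```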


4.6.3. Return the average city population for each state

db.aggregation.aggregate(
	[
		{
			$group:
			{
				_id:
				{
					state:"$state",
					city: "$city" 
				}, 
				cityPop:
				{
					$sum: "$pop" 
				}
			} 
		},
		{
			$group:
			{
				_id:"$_id.state", 
				avgCityPop:
				{
					$avg:"$cityPop"
				} 
			} 
		},
		{
			$sort:
			{
				avgCityPop:-1
			}
		}
	]
)
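Two `$group` stages are needed because one city can span several zip codes: the first stage sums the zip codes of each city, and the second averages those city totals per state. The plain-JavaScript sketch below uses made-up documents (hypothetical state and city names) to show that this differs from averaging the zip code documents directly.

```javascript
// Made-up documents in the shape of zips.json; city X has two zip codes.
const zipDocs = [
  { state: "AA", city: "X", pop: 100 },
  { state: "AA", city: "X", pop: 200 },
  { state: "AA", city: "Y", pop: 100 },
];

// First $group: population per (state, city).
const cityPop = new Map();
for (const z of zipDocs) {
  const key = JSON.stringify([z.state, z.city]);
  cityPop.set(key, (cityPop.get(key) || 0) + z.pop);
}

// Second $group: average of the city totals per state.
const popsPerState = {};
for (const [key, pop] of cityPop) {
  const [state] = JSON.parse(key);
  if (!popsPerState[state]) popsPerState[state] = [];
  popsPerState[state].push(pop);
}
const avgCityPop = Object.entries(popsPerState).map(([_id, pops]) => ({
  _id,
  avgCityPop: pops.reduce((a, b) => a + b, 0) / pops.length,
}));

console.log(avgCityPop); // [ { _id: 'AA', avgCityPop: 200 } ]
```

Averaging the three documents directly would give 400 / 3, which is the average per zip code rather than per city.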


4.6.4. Return the largest and smallest city in each state

db.aggregation.aggregate(
	[
		{
			$group:
			{
				_id: { state: "$state", city: "$city" },
				pop: { $sum: "$pop" }
			}
		},
		{ 
			$sort: { pop: 1 } 
		},
		{ 
			$group:
			{
				_id : "$_id.state",
				biggestCity: { $last: "$_id.city" },
				biggestPop: { $last: "$pop" },
				smallestCity: { $first: "$_id.city" },
				smallestPop: { $first: "$pop" }
			}
		},
		{ 
			$project:
			{ 
				_id: 0,
				state: "$_id",
				biggestCity: { name: "$biggestCity", pop: "$biggestPop" },
				smallestCity: { name: "$smallestCity", pop: "$smallestPop" }
			}
		},
		{ 
			$sort: { state: 1 } 
		}
	]
)
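The key idea in this pipeline is that sorting by population in ascending order before grouping makes `$first` the smallest city and `$last` the largest. A plain-JavaScript sketch on made-up city totals (hypothetical names, standing in for the output of the first `$group` stage):

```javascript
// Made-up city totals, i.e. what the first $group stage would emit.
const cityTotals = [
  { _id: { state: "AA", city: "X" }, pop: 300 },
  { _id: { state: "AA", city: "Y" }, pop: 100 },
  { _id: { state: "AA", city: "W" }, pop: 200 },
];

// $sort: { pop: 1 }
const sorted = [...cityTotals].sort((a, b) => a.pop - b.pop);

// $group by state with $first / $last, shaped like the final $project.
const byState = {};
for (const c of sorted) {
  const state = c._id.state;
  const entry = { name: c._id.city, pop: c.pop };
  if (!byState[state]) byState[state] = { state, smallestCity: entry }; // $first
  byState[state].biggestCity = entry;                                   // $last
}

console.log(byState.AA);
// { state: 'AA', smallestCity: { name: 'Y', pop: 100 }, biggestCity: { name: 'X', pop: 300 } }
```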
