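Because MongoDB data is schemaless, joins between collections often run into problems, and null values are one of the most common causes.
Suppose we have two collections, as follows: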
department:
/* 1 */
{
"_id" : ObjectId("59c137a7f6c0f783dff3f803"),
"deptNo" : "1001",
"description" : "HR"
}
/* 2 */
{
"_id" : ObjectId("59c137a7f6c0f783dff3f804"),
"deptNo" : "1002",
"description" : "IT"
}
/* 3 */
{
"_id" : ObjectId("59c13862f6c0f783dff3f805"),
"deptNo" : null,
"description" : "marketing"
}
employee:
/* 1 */
{
"_id" : ObjectId("59c137a7f6c0f783dff3f7fc"),
"eno" : "2001",
"name" : "Jack",
"department" : "1001",
"gender" : "male"
}
/* 2 */
{
"_id" : ObjectId("59c137a7f6c0f783dff3f7fd"),
"eno" : "2002",
"name" : "Tom",
"department" : "1001",
"gender" : "male"
}
/* 3 */
{
"_id" : ObjectId("59c137a7f6c0f783dff3f7fe"),
"eno" : "2003",
"name" : "Tony",
"department" : "1001",
"gender" : "male"
}
/* 4 */
{
"_id" : ObjectId("59c137a7f6c0f783dff3f7ff"),
"eno" : "2004",
"name" : "Alice",
"department" : "1002",
"gender" : "female"
}
/* 5 */
{
"_id" : ObjectId("59c137a7f6c0f783dff3f800"),
"eno" : "2005",
"name" : "Jenny",
"department" : "1002",
"gender" : "female"
}
/* 6 */
{
"_id" : ObjectId("59c137a7f6c0f783dff3f801"),
"eno" : "2006",
"name" : "Angel",
"department" : null,
"gender" : "female"
}
We then join the two collections with a $lookup stage in an aggregation pipeline:
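db.employee.aggregate([
    {
        // Join each employee to the departments whose deptNo equals employee.department
        $lookup: {
            from: "department",
            localField: "department",
            foreignField: "deptNo",
            as: "departmentInfo"
        }
    },
    {
        // Flatten departmentInfo; keep employees that matched nothing
        $unwind: {
            path: "$departmentInfo",
            preserveNullAndEmptyArrays: true
        }
    }
])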
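In Robomongo we can clearly see that the employee record whose department is null has been joined to a department record.
Inspecting the output shows that it was joined to the department record whose deptNo is null: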
{
"_id" : ObjectId("59c137a7f6c0f783dff3f801"),
"eno" : "2006",
"name" : "Angel",
"department" : null,
"gender" : "female",
"departmentInfo" : {
"_id" : ObjectId("59c13862f6c0f783dff3f805"),
"deptNo" : null,
"description" : "marketing"
}
}
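At this point we can see that MongoDB treats null as an ordinary value for the equality match when joining, so null matches null. How do we solve this?
We can add a new field before the $lookup stage and join on that field instead. When the field acting as the "foreign key" is null, we replace it with a value that cannot match anything, for example: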
db.employee.aggregate([
    {
        // Copy department into deptNumber, replacing an explicit null with a sentinel
        $addFields: {
            deptNumber: {
                $cond: {
                    if: { $eq: ["$department", null] },
                    then: "invalidNumber",
                    else: "$department"
                }
            }
        }
    },
    {
        // Join on the new deptNumber field instead of department
        $lookup: {
            from: "department",
            localField: "deptNumber",
            foreignField: "deptNo",
            as: "departmentInfo"
        }
    },
    {
        $unwind: {
            path: "$departmentInfo",
            preserveNullAndEmptyArrays: true
        }
    }
])
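After this change, the record whose department is null is no longer joined to any record in the department collection.
This seems to solve the problem, but consider the following example.
Insert a new record into the department collection: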
db.department.insert({description : "financial"})
Also insert a new record into the employee collection:
db.employee.insert(
{eno : "2007", name : "Zoe", gender : "female"}
)
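Note that neither of these records has a department number field. Running the improved $lookup from above on this data gives a surprising result:
the newly added employee record is joined to two department records.
The output: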
/* 7 */
{
"_id" : ObjectId("59c14a9bf6c0f783dff3f808"),
"eno" : "2007",
"name" : "Zoe",
"gender" : "female",
"departmentInfo" : {
"_id" : ObjectId("59c13862f6c0f783dff3f805"),
"deptNo" : null,
"description" : "marketing"
}
}
/* 8 */
{
"_id" : ObjectId("59c14a9bf6c0f783dff3f808"),
"eno" : "2007",
"name" : "Zoe",
"gender" : "female",
"departmentInfo" : {
"_id" : ObjectId("59c14a24f6c0f783dff3f807"),
"description" : "financial"
}
}
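The record named Zoe, which has no department number, is joined to both the record whose deptNo is null and the record that has no deptNo field at all. During $lookup, MongoDB treats a missing field as null, on both the local and the foreign side, so two records are matched here. Our $cond only handles a field that is explicitly null, not a field that is missing, so the problem remains.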
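The matching rule described above can be illustrated with a small plain-JavaScript model. This is only a sketch for reasoning about the behavior, not MongoDB's implementation: the names joinKey and modelLookup are made up for this example, the data is a trimmed copy of the sample collections, and the model only covers scalar fields.

```javascript
// Illustrative model only: a missing field and an explicit null are both
// normalized to null before the equality comparison.
function joinKey(doc, field) {
  const value = doc[field];
  return value === undefined ? null : value;
}

// Attach to each local document the array of foreign documents whose
// join key equals the local join key (hypothetical helper).
function modelLookup(localDocs, foreignDocs, localField, foreignField, as) {
  return localDocs.map((local) => ({
    ...local,
    [as]: foreignDocs.filter(
      (foreign) => joinKey(foreign, foreignField) === joinKey(local, localField)
    ),
  }));
}

const departments = [
  { deptNo: "1001", description: "HR" },
  { deptNo: "1002", description: "IT" },
  { deptNo: null, description: "marketing" },
  { description: "financial" }, // no deptNo field at all
];

const employees = [
  { eno: "2001", name: "Jack", department: "1001" },
  { eno: "2006", name: "Angel", department: null },
  { eno: "2007", name: "Zoe" }, // no department field at all
];

const joined = modelLookup(
  employees, departments, "department", "deptNo", "departmentInfo"
);
for (const row of joined) {
  console.log(row.name, row.departmentInfo.map((d) => d.description));
}
```

In this model, with no substitution applied, both Angel (explicit null) and Zoe (missing field) pair with marketing and financial, while Jack pairs only with HR.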
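According to the official documentation, the $ifNull operator replaces a value that is null or a field that does not exist. So we revise the solution as follows: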
db.employee.aggregate([
    {
        // $ifNull substitutes the sentinel when department is null or missing
        $addFields: {
            deptNumber: {
                $ifNull: ["$department", "invalidNumber"]
            }
        }
    },
    {
        $lookup: {
            from: "department",
            localField: "deptNumber",
            foreignField: "deptNo",
            as: "departmentInfo"
        }
    },
    {
        $unwind: {
            path: "$departmentInfo",
            preserveNullAndEmptyArrays: true
        }
    }
])
With this change, records whose department number is null or missing are no longer joined to the department collection.
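If the same pattern is needed for several joins, the three stages could be wrapped in a small helper. The following is a minimal sketch: the function nullSafeLookupStages and its joinField and sentinel parameters are hypothetical names for this example, not part of MongoDB, and the sentinel is assumed never to occur as a real value of the foreign field.

```javascript
// Hypothetical helper: builds the $addFields / $lookup / $unwind stages
// used in this article for an arbitrary pair of collections.
function nullSafeLookupStages({
  from,
  localField,
  foreignField,
  as,
  joinField = "deptNumber",   // temporary field that carries the join key
  sentinel = "invalidNumber", // assumed never to appear as a real foreign value
}) {
  return [
    // Replace a null or missing local value with the sentinel
    { $addFields: { [joinField]: { $ifNull: ["$" + localField, sentinel] } } },
    // Join on the temporary field instead of the original one
    { $lookup: { from, localField: joinField, foreignField, as } },
    // Flatten the result; keep documents that matched nothing
    { $unwind: { path: "$" + as, preserveNullAndEmptyArrays: true } },
  ];
}

const stages = nullSafeLookupStages({
  from: "department",
  localField: "department",
  foreignField: "deptNo",
  as: "departmentInfo",
});
console.log(JSON.stringify(stages, null, 2));
// In the mongo shell, the stages would be passed as: db.employee.aggregate(stages)
```

The approach relies entirely on the sentinel: if some department really had deptNo equal to "invalidNumber", employees without a department would be joined to it, so the value should be chosen with the actual data in mind.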