MongoDB 中的 $unwind 运算符是什么?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/16448175/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
What's the $unwind operator in MongoDB?
提问by gremo
This is my first day with MongoDB so please go easy with me :)
这是我使用 MongoDB 的第一天,所以请和我一起轻松:)
I can't understand the $unwind
operator, maybe because English is not my native language.
我听不懂$unwind
接线员,可能是因为英语不是我的母语。
db.article.aggregate(
{ $project : {
author : 1 ,
title : 1 ,
tags : 1
}},
{ $unwind : "$tags" }
);
The project operator is something I can understand, I suppose (it's like SELECT
, isn't it?). But then, $unwind
(citing) returns one document for every member of the unwound array within every source document.
项目运营商是我能理解的东西,我想(就像SELECT
,不是吗?)。但是,$unwind
(引用)为每个源文档中展开数组的每个成员返回一个文档。
Is this like a JOIN
? If yes, how the result of $project
(with _id
, author
, title
and tags
fields) can be compared with the tags
array?
这是像JOIN
吗?如果是,如何将$project
(with _id
, author
, title
and tags
fields)的结果与tags
数组进行比较?
NOTE: I've taken the example from MongoDB website, I don't know the structure of tags
array. I think it's a simple array of tag names.
注意:我从 MongoDB 网站上拿过例子,我不知道tags
数组的结构。我认为这是一个简单的标签名称数组。
回答by HGS Labs
First off, welcome to MongoDB!
首先,欢迎来到 MongoDB!
The thing to remember is that MongoDB employs an "NoSQL" approach to data storage, so perish the thoughts of selects, joins, etc. from your mind. The way that it stores your data is in the form of documents and collections, which allows for a dynamic means of adding and obtaining the data from your storage locations.
需要记住的是,MongoDB 采用“NoSQL”方法来存储数据,因此请从您的脑海中消除选择、连接等的想法。它以文档和集合的形式存储您的数据,这允许以动态方式从您的存储位置添加和获取数据。
That being said, in order to understand the concept behind the $unwind parameter, you first must understand what the use case that you are trying to quote is saying. The example document from mongodb.orgis as follows:
话虽如此,为了理解 $unwind 参数背后的概念,您首先必须了解您试图引用的用例在说什么。来自mongodb.org的示例文档如下:
{
title : "this is my title" ,
author : "bob" ,
posted : new Date () ,
pageViews : 5 ,
tags : [ "fun" , "good" , "fun" ] ,
comments : [
{ author :"joe" , text : "this is cool" } ,
{ author :"sam" , text : "this is bad" }
],
other : { foo : 5 }
}
Notice how tags is actually an array of 3 items, in this case being "fun", "good" and "fun".
注意标签实际上是一个由 3 个项目组成的数组,在这种情况下是“有趣”、“好”和“有趣”。
What $unwind does is allow you to peel off a document for each element and returns that resulting document. To think of this in a classical approach, it would be the equivilent of "for each item in the tags array, return a document with only that item".
$unwind 所做的是允许您为每个元素剥离文档并返回该结果文档。以经典方法来考虑这一点,这将等同于“对于标签数组中的每个项目,返回仅包含该项目的文档”。
Thus, the result of running the following:
因此,运行以下结果:
db.article.aggregate(
{ $project : {
author : 1 ,
title : 1 ,
tags : 1
}},
{ $unwind : "$tags" }
);
would return the following documents:
将返回以下文件:
{
"result" : [
{
"_id" : ObjectId("4e6e4ef557b77501a49233f6"),
"title" : "this is my title",
"author" : "bob",
"tags" : "fun"
},
{
"_id" : ObjectId("4e6e4ef557b77501a49233f6"),
"title" : "this is my title",
"author" : "bob",
"tags" : "good"
},
{
"_id" : ObjectId("4e6e4ef557b77501a49233f6"),
"title" : "this is my title",
"author" : "bob",
"tags" : "fun"
}
],
"OK" : 1
}
Notice that the only thing changing in the result array is what is being returned in the tags value. If you need an additional reference on how this works, I've included a link here. Hopefully this helps, and good luck with your foray into one of the best NoSQL systems that I have come across thus far.
请注意,结果数组中唯一更改的是标签值中返回的内容。如果您需要有关其工作原理的额外参考,我在此处提供了一个链接。希望这会有所帮助,并祝您进入我迄今为止遇到的最好的 NoSQL 系统之一。
回答by JohnnyHK
$unwind
duplicates each document in the pipeline, once per array element.
$unwind
复制管道中的每个文档,每个数组元素一次。
So if your input pipeline contains one article doc with two elements in tags
, {$unwind: '$tags'}
would transform the pipeline to be two article docs that are the same except for the tags
field. In the first doc, tags
would contain the first element from the original doc's array, and in the second doc, tags
would contain the second element.
因此,如果您的输入管道包含一个具有两个元素的文章文档tags
,{$unwind: '$tags'}
则会将管道转换为两个除tags
字段外都相同的文章文档。在第一个文档中,tags
将包含原始文档数组中的第一个元素,在第二个文档中,tags
将包含第二个元素。
回答by xameeramir
Let's understand it by an example
让我们通过一个例子来理解它
This is how the companydocument looks like:
这是公司文件的样子:
The $unwind
allows us to take documents as input that have an array valued field and produces output documents, such that there's one output document for each element in the array. source
这$unwind
允许我们将具有数组值字段的文档作为输入并生成输出文档,这样数组中的每个元素都有一个输出文档。来源
So let's go back to our companies examples, and take a look at the use of unwind stages. This query:
那么让我们回到我们公司的例子,看看 unwind 阶段的使用。这个查询:
db.companies.aggregate([
{ $match: {"funding_rounds.investments.financial_org.permalink": "greylock" } },
{ $project: {
_id: 0,
name: 1,
amount: "$funding_rounds.raised_amount",
year: "$funding_rounds.funded_year"
} }
])
produces documents that have arrays for both amount and year.
生成具有数量和年份数组的文档。
Because we're accessing the raised amount and the funded year for every element within the funding rounds array. To fix this, we can include an unwind stage before our project stage in this aggregation pipeline, and parameterize this by saying that we want to unwind
the funding rounds array:
因为我们正在访问资金轮次数组中每个元素的筹集金额和资助年份。为了解决这个问题,我们可以在这个聚合管道中的项目阶段之前包含一个展开阶段,并通过说我们想要unwind
资金轮次数组来参数化它:
db.companies.aggregate([
{ $match: {"funding_rounds.investments.financial_org.permalink": "greylock" } },
{ $unwind: "$funding_rounds" },
{ $project: {
_id: 0,
name: 1,
amount: "$funding_rounds.raised_amount",
year: "$funding_rounds.funded_year"
} }
])
If we look at the funding_rounds
array, we know that for each funding_rounds
, there is a raised_amount
and a funded_year
field. So, unwind
will for each one of the documents that are elements of the funding_rounds
array produce an output document. Now, in this example, our values are string
s. But, regardless of the type of value for the elements in an array, unwind
will produce an output document for each one of these values, such that the field in question will have just that element. In the case of funding_rounds
, that element will be one of these documents as the value for funding_rounds
for every document that gets passed on to our project
stage. The result, then of having run this, is that now we get an amount
and a year
. One for each funding round for every companyin our collection. What this means is that our match produced many company documents and each one of those company documents results in many documents. One for each funding round within every company document. unwind
performs this operation using the documents handed to it from the match
stage. And all of these documents for every company are then passed to the project
stage.
如果我们查看funding_rounds
数组,我们知道对于每个funding_rounds
,都有一个raised_amount
和一个funded_year
字段。因此,unwind
将为作为funding_rounds
数组元素的每个文档生成一个输出文档。现在,在这个例子中,我们的值是string
s。但是,无论数组中元素的值是什么类型,unwind
都会为这些值中的每一个生成一个输出文档,这样所讨论的字段将只有该元素。在 的情况下funding_rounds
,该元素将是这些文档之一,作为funding_rounds
传递到我们project
阶段的每个文档的值。运行它的结果是,现在我们得到 anamount
和 a year
。一个用于每一轮融资,每家公司在我们的收藏中。这意味着我们的匹配产生了许多公司文件,而这些公司文件中的每一份都会产生许多文件。每个公司文件中的每一轮融资都有一个。unwind
使用从match
舞台交给它的文件执行此操作。然后将每个公司的所有这些文件传递给project
舞台。
So, all documents where the funder was Greylock(as in the query example) will be split into a number of documents, equal to the number of funding rounds for every company that matches the filter $match: {"funding_rounds.investments.financial_org.permalink": "greylock" }
. And each one those resulting documents will then be passed along to our project
. Now, unwind
produces an exact copy for every one of the documents that it receives as input. All fields have the same key and value, with one exception, and that is that the funding_rounds
field rather than being an array of funding_rounds
documents, instead has a value that is a single document, which is an individual funding round. So, a company that has 4funding rounds will result in unwind
creating 4documents. Where every field is an exact copy, except for the funding_rounds
field, which will instead of being an array for each of those copies will instead be an individual element from the funding_rounds
array from the company document that unwind
is currently processing. So, unwind
has the effect of outputting to the next stage more documents than it receives as input. What that means is that our project
stage now gets a funding_rounds
field that again, is not an array, but is instead a nested document that has a raised_amount
and a funded_year
field. So, project
will receive multiple documents for each company match
ing the filter and can therefore process each of the documents individually and identify an individual amount and year for each funding round for each company.
因此,资助者为Greylock 的所有文档(如查询示例中所示)将被拆分为多个文档,等于与 filter 匹配的每个公司的融资轮数$match: {"funding_rounds.investments.financial_org.permalink": "greylock" }
。然后每个生成的文件都将传递给我们的project
. 现在,unwind
为它作为输入接收的每个文档生成一个精确副本。所有字段都具有相同的键和值,但有一个例外,即该funding_rounds
字段不是一组funding_rounds
文档,而是具有单个文档的值,这是单个融资轮次。因此,拥有4轮融资的公司将导致unwind
创建4文件。每个字段都是一个精确副本,除了funding_rounds
字段,它不是每个副本的数组,而是来自当前正在处理funding_rounds
的公司文档的数组中的一个单独元素unwind
。因此,unwind
输出到下一阶段的文档比它作为输入接收的文档多。这意味着我们的project
舞台现在funding_rounds
再次获得一个字段,该字段不是一个数组,而是一个包含 araised_amount
和一个funded_year
字段的嵌套文档。因此,project
将match
通过过滤器收到每个公司的多个文件,因此可以单独处理每个文件,并为每个公司的每轮融资确定单独的金额和年份.
回答by Amitesh
As per mongodb official documentation :
根据 mongodb 官方文档:
$unwindDeconstructs an array field from the input documents to output a document for each element. Each output document is the input document with the value of the array field replaced by the element.
$unwind从输入文档中解构一个数组字段,为每个元素输出一个文档。每个输出文档都是输入文档,其中数组字段的值被元素替换。
Explanation through basic example :
通过基本示例说明:
A collection inventory has the following documents:
馆藏清单具有以下文件:
{ "_id" : 1, "item" : "ABC", "sizes": [ "S", "M", "L"] }
{ "_id" : 2, "item" : "EFG", "sizes" : [ ] }
{ "_id" : 3, "item" : "IJK", "sizes": "M" }
{ "_id" : 4, "item" : "LMN" }
{ "_id" : 5, "item" : "XYZ", "sizes" : null }
The following $unwindoperations are equivalent and return a document for each element in the sizesfield. If the sizes field does not resolve to an array but is not missing, null, or an empty array, $unwind treats the non-array operand as a single element array.
以下 $ unwind操作是等效的,并为size字段中的每个元素返回一个文档。如果 size 字段未解析为数组但不丢失、为 null 或空数组,则 $unwind 将非数组操作数视为单元素数组。
db.inventory.aggregate( [ { $unwind: "$sizes" } ] )
or
或者
db.inventory.aggregate( [ { $unwind: { path: "$sizes" } } ]
Above query output :
以上查询输出:
{ "_id" : 1, "item" : "ABC", "sizes" : "S" }
{ "_id" : 1, "item" : "ABC", "sizes" : "M" }
{ "_id" : 1, "item" : "ABC", "sizes" : "L" }
{ "_id" : 3, "item" : "IJK", "sizes" : "M" }
Why is it needed?
为什么需要它?
$unwind is very useful while performing aggregation. it breaks complex/nested document into simple document before performaing various operation like sorting, searcing etc.
$unwind 在执行聚合时非常有用。它在执行排序、搜索等各种操作之前将复杂/嵌套的文档分解为简单的文档。
To know more about $unwind :
要了解有关 $unwind 的更多信息:
https://docs.mongodb.com/manual/reference/operator/aggregation/unwind/
https://docs.mongodb.com/manual/reference/operator/aggregation/unwind/
To know more about aggregation :
要了解有关聚合的更多信息:
https://docs.mongodb.com/manual/reference/operator/aggregation-pipeline/
https://docs.mongodb.com/manual/reference/operator/aggregation-pipeline/
回答by Pravin
consider the below example to understand this Data in a collection
考虑以下示例以了解集合中的此数据
{
"_id" : 1,
"shirt" : "Half Sleeve",
"sizes" : [
"medium",
"XL",
"free"
]
}
Query -- db.test1.aggregate( [ { $unwind : "$sizes" } ] );
查询 -- db.test1.aggregate( [ { $unwind : "$sizes" } ] );
output
输出
{ "_id" : 1, "shirt" : "Half Sleeve", "sizes" : "medium" }
{ "_id" : 1, "shirt" : "Half Sleeve", "sizes" : "XL" }
{ "_id" : 1, "shirt" : "Half Sleeve", "sizes" : "free" }
回答by Jeb50
Let me explain in a way corelated to RDBMS way. This is the statement:
让我以与 RDBMS 相关的方式进行解释。这是声明:
db.article.aggregate(
{ $project : {
author : 1 ,
title : 1 ,
tags : 1
}},
{ $unwind : "$tags" }
);
to apply to the document / record:
申请文件/记录:
{
title : "this is my title" ,
author : "bob" ,
posted : new Date () ,
pageViews : 5 ,
tags : [ "fun" , "good" , "fun" ] ,
comments : [
{ author :"joe" , text : "this is cool" } ,
{ author :"sam" , text : "this is bad" }
],
other : { foo : 5 }
}
The $project / Selectsimply returns these field/columns as
在$项目/选择简单地返回这些字段/列,
SELECTauthor, title, tags FROMarticle
从文章中选择作者、标题、标签
Next is the fun part of Mongo, consider this array tags : [ "fun" , "good" , "fun" ]
as another related table (can't be a lookup/reference table because values has some duplication) named "tags". Remember SELECT generally produces things vertical, so unwind the "tags" is to split()vertically into table "tags".
接下来是 Mongo 的有趣部分,将这个数组tags : [ "fun" , "good" , "fun" ]
视为另一个名为“tags”的相关表(不能是查找/引用表,因为值有一些重复)。记住 SELECT 通常会产生垂直的东西,所以展开“标签”就是将split()垂直地分成表“标签”。
The end result of $project + $unwind:
Translate the output to JSON:
将输出转换为 JSON:
{ "author": "bob", "title": "this is my title", "tags": "fun"},
{ "author": "bob", "title": "this is my title", "tags": "good"},
{ "author": "bob", "title": "this is my title", "tags": "fun"}
Because we didn't tell Mongo to omit "_id" field, so it's auto-added.
因为我们没有告诉 Mongo 省略“_id”字段,所以它是自动添加的。
The key is to make it table-like to perform aggregation.
关键是让它像表格一样执行聚合。