MongoDB 中的 $unwind 运算符是什么?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/16448175/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-09 13:12:44  来源:igfitidea点击:

What's the $unwind operator in MongoDB?

mongodbaggregation-framework

提问by gremo

This is my first day with MongoDB so please go easy with me :)

这是我使用 MongoDB 的第一天,所以请和我一起轻松:)

I can't understand the $unwindoperator, maybe because English is not my native language.

我听不懂$unwind接线员,可能是因为英语不是我的母语。

db.article.aggregate(
    { $project : {
        author : 1 ,
        title : 1 ,
        tags : 1
    }},
    { $unwind : "$tags" }
);

The project operator is something I can understand, I suppose (it's like SELECT, isn't it?). But then, $unwind(citing) returns one document for every member of the unwound array within every source document.

项目运营商是我能理解的东西,我想(就像SELECT,不是吗?)。但是,$unwind(引用)为每个源文档中展开数组的每个成员返回一个文档

Is this like a JOIN? If yes, how the result of $project(with _id, author, titleand tagsfields) can be compared with the tagsarray?

这是像JOIN吗?如果是,如何将$project(with _id, author, titleand tagsfields)的结果与tags数组进行比较?

NOTE: I've taken the example from MongoDB website, I don't know the structure of tagsarray. I think it's a simple array of tag names.

注意:我从 MongoDB 网站上拿过例子,我不知道tags数组的结构。我认为这是一个简单的标签名称数组。

回答by HGS Labs

First off, welcome to MongoDB!

首先,欢迎来到 MongoDB!

The thing to remember is that MongoDB employs an "NoSQL" approach to data storage, so perish the thoughts of selects, joins, etc. from your mind. The way that it stores your data is in the form of documents and collections, which allows for a dynamic means of adding and obtaining the data from your storage locations.

需要记住的是,MongoDB 采用“NoSQL”方法来存储数据,因此请从您的脑海中消除选择、连接等的想法。它以文档和集合的形式存储您的数据,这允许以动态方式从您的存储位置添加和获取数据。

That being said, in order to understand the concept behind the $unwind parameter, you first must understand what the use case that you are trying to quote is saying. The example document from mongodb.orgis as follows:

话虽如此,为了理解 $unwind 参数背后的概念,您首先必须了解您试图引用的用例在说什么。来自mongodb.org的示例文档如下:

{
 title : "this is my title" ,
 author : "bob" ,
 posted : new Date () ,
 pageViews : 5 ,
 tags : [ "fun" , "good" , "fun" ] ,
 comments : [
             { author :"joe" , text : "this is cool" } ,
             { author :"sam" , text : "this is bad" }
 ],
 other : { foo : 5 }
}

Notice how tags is actually an array of 3 items, in this case being "fun", "good" and "fun".

注意标签实际上是一个由 3 个项目组成的数组,在这种情况下是“有趣”、“好”和“有趣”。

What $unwind does is allow you to peel off a document for each element and returns that resulting document. To think of this in a classical approach, it would be the equivilent of "for each item in the tags array, return a document with only that item".

$unwind 所做的是允许您为每个元素剥离文档并返回该结果文档。以经典方法来考虑这一点,这将等同于“对于标签数组中的每个项目,返回仅包含该项目的文档”。

Thus, the result of running the following:

因此,运行以下结果:

db.article.aggregate(
    { $project : {
        author : 1 ,
        title : 1 ,
        tags : 1
    }},
    { $unwind : "$tags" }
);

would return the following documents:

将返回以下文件:

{
     "result" : [
             {
                     "_id" : ObjectId("4e6e4ef557b77501a49233f6"),
                     "title" : "this is my title",
                     "author" : "bob",
                     "tags" : "fun"
             },
             {
                     "_id" : ObjectId("4e6e4ef557b77501a49233f6"),
                     "title" : "this is my title",
                     "author" : "bob",
                     "tags" : "good"
             },
             {
                     "_id" : ObjectId("4e6e4ef557b77501a49233f6"),
                     "title" : "this is my title",
                     "author" : "bob",
                     "tags" : "fun"
             }
     ],
     "OK" : 1
}

Notice that the only thing changing in the result array is what is being returned in the tags value. If you need an additional reference on how this works, I've included a link here. Hopefully this helps, and good luck with your foray into one of the best NoSQL systems that I have come across thus far.

请注意,结果数组中唯一更改的是标签值中返回的内容。如果您需要有关其工作原理的额外参考,我在此处提供了一个链接。希望这会有所帮助,并祝您进入我迄今为止遇到的最好的 NoSQL 系统之一。

回答by JohnnyHK

$unwindduplicates each document in the pipeline, once per array element.

$unwind复制管道中的每个文档,每个数组元素一次。

So if your input pipeline contains one article doc with two elements in tags, {$unwind: '$tags'}would transform the pipeline to be two article docs that are the same except for the tagsfield. In the first doc, tagswould contain the first element from the original doc's array, and in the second doc, tagswould contain the second element.

因此,如果您的输入管道包含一个具有两个元素的文章文档tags{$unwind: '$tags'}则会将管道转换为两个除tags字段外都相同的文章文档。在第一个文档中,tags将包含原始文档数组中的第一个元素,在第二个文档中,tags将包含第二个元素。

回答by xameeramir

Let's understand it by an example

让我们通过一个例子来理解它

This is how the companydocument looks like:

这是公司文件的样子:

original document

原始文件

The $unwindallows us to take documents as input that have an array valued field and produces output documents, such that there's one output document for each element in the array. source

$unwind允许我们将具有数组值字段的文档作为输入并生成输出文档,这样数组中的每个元素都有一个输出文档。来源

The $unwind stage

$unwind 阶段

So let's go back to our companies examples, and take a look at the use of unwind stages. This query:

那么让我们回到我们公司的例子,看看 unwind 阶段的使用。这个查询:


db.companies.aggregate([
    { $match: {"funding_rounds.investments.financial_org.permalink": "greylock" } },
    { $project: {
        _id: 0,
        name: 1,
        amount: "$funding_rounds.raised_amount",
        year: "$funding_rounds.funded_year"
    } }
])

produces documents that have arrays for both amount and year.

生成具有数量和年份数组的文档。

project output

项目产出

Because we're accessing the raised amount and the funded year for every element within the funding rounds array. To fix this, we can include an unwind stage before our project stage in this aggregation pipeline, and parameterize this by saying that we want to unwindthe funding rounds array:

因为我们正在访问资金轮次数组中每个元素的筹集金额和资助年份。为了解决这个问题,我们可以在这个聚合管道中的项目阶段之前包含一个展开阶段,并通过说我们想要unwind资金轮次数组来参数化它:


db.companies.aggregate([
    { $match: {"funding_rounds.investments.financial_org.permalink": "greylock" } },
    { $unwind: "$funding_rounds" },
    { $project: {
        _id: 0,
        name: 1,
        amount: "$funding_rounds.raised_amount",
        year: "$funding_rounds.funded_year"
    } }
])

unwind has the effect of outputting to the next stage more documents than it receives as input

unwind 的作用是输出到下一阶段的文档多于它作为输入接收的文档

If we look at the funding_roundsarray, we know that for each funding_rounds, there is a raised_amountand a funded_yearfield. So, unwindwill for each one of the documents that are elements of the funding_roundsarray produce an output document. Now, in this example, our values are strings. But, regardless of the type of value for the elements in an array, unwindwill produce an output document for each one of these values, such that the field in question will have just that element. In the case of funding_rounds, that element will be one of these documents as the value for funding_roundsfor every document that gets passed on to our projectstage. The result, then of having run this, is that now we get an amountand a year. One for each funding round for every companyin our collection. What this means is that our match produced many company documents and each one of those company documents results in many documents. One for each funding round within every company document. unwindperforms this operation using the documents handed to it from the matchstage. And all of these documents for every company are then passed to the projectstage.

如果我们查看funding_rounds数组,我们知道对于每个funding_rounds,都有一个raised_amount和一个funded_year字段。因此,unwind将为作为funding_rounds数组元素的每个文档生成一个输出文档。现在,在这个例子中,我们的值是strings。但是,无论数组中元素的值是什么类型,unwind都会为这些值中的每一个生成一个输出文档,这样所讨论的字段将只有该元素。在 的情况下funding_rounds,该元素将是这些文档之一,作为funding_rounds传递到我们project阶段的每个文档的值。运行它的结果是,现在我们得到 anamount和 a year。一个用于每一轮融资,每家公司在我们的收藏中。这意味着我们的匹配产生了许多公司文件,而这些公司文件中的每一份都会产生许多文件。每个公司文件中的每一轮融资都有一个。unwind使用从match舞台交给它的文件执行此操作。然后将每个公司的所有这些文件传递给project舞台。

unwind output

展开输出

So, all documents where the funder was Greylock(as in the query example) will be split into a number of documents, equal to the number of funding rounds for every company that matches the filter $match: {"funding_rounds.investments.financial_org.permalink": "greylock" }. And each one those resulting documents will then be passed along to our project. Now, unwindproduces an exact copy for every one of the documents that it receives as input. All fields have the same key and value, with one exception, and that is that the funding_roundsfield rather than being an array of funding_roundsdocuments, instead has a value that is a single document, which is an individual funding round. So, a company that has 4funding rounds will result in unwindcreating 4documents. Where every field is an exact copy, except for the funding_roundsfield, which will instead of being an array for each of those copies will instead be an individual element from the funding_roundsarray from the company document that unwindis currently processing. So, unwindhas the effect of outputting to the next stage more documents than it receives as input. What that means is that our projectstage now gets a funding_roundsfield that again, is not an array, but is instead a nested document that has a raised_amountand a funded_yearfield. So, projectwill receive multiple documents for each company matching the filter and can therefore process each of the documents individually and identify an individual amount and year for each funding round for each company.

因此,资助者为Greylock 的所有文档(如查询示例中所示)将被拆分为多个文档,等于与 filter 匹配的每个公司的融资轮数$match: {"funding_rounds.investments.financial_org.permalink": "greylock" }。然后每个生成的文件都将传递给我们的project. 现在,unwind为它作为输入接收的每个文档生成一个精确副本。所有字段都具有相同的键和值,但有一个例外,即该funding_rounds字段不是一组funding_rounds文档,而是具有单个文档的值,这是单个融资轮次。因此,拥有4轮融资的公司将导致unwind创建4文件。每个字段都是一个精确副本,除了funding_rounds字段,它不是每个副本的数组,而是来自当前正在处理funding_rounds的公司文档的数组中的一个单独元素unwind。因此,unwind输出到下一阶段的文档比它作为输入接收的文档多。这意味着我们的project舞台现在funding_rounds再次获得一个字段,该字段不是一个数组,而是一个包含 araised_amount和一个funded_year字段的嵌套文档。因此,projectmatch通过过滤器收到每个公司的多个文件,因此可以单独处理每个文件,并为每个公司的每轮融资确定单独的金额和年份.

回答by Amitesh

As per mongodb official documentation :

根据 mongodb 官方文档:

$unwindDeconstructs an array field from the input documents to output a document for each element. Each output document is the input document with the value of the array field replaced by the element.

$unwind从输入文档中解构一个数组字段,为每个元素输出一个文档。每个输出文档都是输入文档,其中数组字段的值被元素替换。

Explanation through basic example :

通过基本示例说明:

A collection inventory has the following documents:

馆藏清单具有以下文件:

{ "_id" : 1, "item" : "ABC", "sizes": [ "S", "M", "L"] }
{ "_id" : 2, "item" : "EFG", "sizes" : [ ] }
{ "_id" : 3, "item" : "IJK", "sizes": "M" }
{ "_id" : 4, "item" : "LMN" }
{ "_id" : 5, "item" : "XYZ", "sizes" : null }

The following $unwindoperations are equivalent and return a document for each element in the sizesfield. If the sizes field does not resolve to an array but is not missing, null, or an empty array, $unwind treats the non-array operand as a single element array.

以下 $ unwind操作是等效的,并为size字段中的每个元素返回一个文档。如果 size 字段未解析为数组但不丢失、为 null 或空数组,则 $unwind 将非数组操作数视为单元素数组。

db.inventory.aggregate( [ { $unwind: "$sizes" } ] )

or

或者

db.inventory.aggregate( [ { $unwind: { path: "$sizes" } } ] 

Above query output :

以上查询输出:

{ "_id" : 1, "item" : "ABC", "sizes" : "S" }
{ "_id" : 1, "item" : "ABC", "sizes" : "M" }
{ "_id" : 1, "item" : "ABC", "sizes" : "L" }
{ "_id" : 3, "item" : "IJK", "sizes" : "M" }

Why is it needed?

为什么需要它?

$unwind is very useful while performing aggregation. it breaks complex/nested document into simple document before performaing various operation like sorting, searcing etc.

$unwind 在执行聚合时非常有用。它在执行排序、搜索等各种操作之前将复杂/嵌套的文档分解为简单的文档。

To know more about $unwind :

要了解有关 $unwind 的更多信息:

https://docs.mongodb.com/manual/reference/operator/aggregation/unwind/

https://docs.mongodb.com/manual/reference/operator/aggregation/unwind/

To know more about aggregation :

要了解有关聚合的更多信息:

https://docs.mongodb.com/manual/reference/operator/aggregation-pipeline/

https://docs.mongodb.com/manual/reference/operator/aggregation-pipeline/

回答by Pravin

consider the below example to understand this Data in a collection

考虑以下示例以了解集合中的此数据

{
        "_id" : 1,
        "shirt" : "Half Sleeve",
        "sizes" : [
                "medium",
                "XL",
                "free"
        ]
}

Query -- db.test1.aggregate( [ { $unwind : "$sizes" } ] );

查询 -- db.test1.aggregate( [ { $unwind : "$sizes" } ] );

output

输出

{ "_id" : 1, "shirt" : "Half Sleeve", "sizes" : "medium" }
{ "_id" : 1, "shirt" : "Half Sleeve", "sizes" : "XL" }
{ "_id" : 1, "shirt" : "Half Sleeve", "sizes" : "free" }

回答by Jeb50

Let me explain in a way corelated to RDBMS way. This is the statement:

让我以与 RDBMS 相关的方式进行解释。这是声明:

db.article.aggregate(
    { $project : {
        author : 1 ,
        title : 1 ,
        tags : 1
    }},
    { $unwind : "$tags" }
);

to apply to the document / record:

申请文件/记录

{
 title : "this is my title" ,
 author : "bob" ,
 posted : new Date () ,
 pageViews : 5 ,
 tags : [ "fun" , "good" , "fun" ] ,
 comments : [
             { author :"joe" , text : "this is cool" } ,
             { author :"sam" , text : "this is bad" }
 ],
 other : { foo : 5 }
}

The $project / Selectsimply returns these field/columns as

$项目/选择简单地返回这些字段/列,

SELECTauthor, title, tags FROMarticle

文章中选择作者、标题、标签

Next is the fun part of Mongo, consider this array tags : [ "fun" , "good" , "fun" ]as another related table (can't be a lookup/reference table because values has some duplication) named "tags". Remember SELECT generally produces things vertical, so unwind the "tags" is to split()vertically into table "tags".

接下来是 Mongo 的有趣部分,将这个数组tags : [ "fun" , "good" , "fun" ]视为另一个名为“tags”的相关表(不能是查找/引用表,因为值有一些重复)。记住 SELECT 通常会产生垂直的东西,所以展开“标签”就是将split()垂直地分成表“标签”。

The end result of $project + $unwind: enter image description here

$project + $unwind 的最终结果: 在此处输入图片说明

Translate the output to JSON:

将输出转换为 JSON:

{ "author": "bob", "title": "this is my title", "tags": "fun"},
{ "author": "bob", "title": "this is my title", "tags": "good"},
{ "author": "bob", "title": "this is my title", "tags": "fun"}

Because we didn't tell Mongo to omit "_id" field, so it's auto-added.

因为我们没有告诉 Mongo 省略“_id”字段,所以它是自动添加的。

The key is to make it table-like to perform aggregation.

关键是让它像表格一样执行聚合。