mongodb 查找数组字段不为空的MongoDB记录

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/14789684/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-09 13:02:55  来源:igfitidea点击:

Find MongoDB records where array field is not empty

mongodbmongoose

提问by skerit

All of my records have a field called "pictures". This field is an array of strings.

我所有的记录都有一个名为“图片”的字段。该字段是一个字符串数组。

I now want the newest 10 records where this array IS NOT empty.

我现在想要这个数组不为空的最新 10 条记录。

I've googled around, but strangely enough I haven't found much on this. I've read into the $where option, but I was wondering how slow that is to native functions, and if there is a better solution.

我已经用谷歌搜索过了,但奇怪的是我没有找到太多关于这个的信息。我已经阅读了 $where 选项,但我想知道这对本机函数有多慢,以及是否有更好的解决方案。

And even then, that does not work:

即便如此,这也不起作用:

ME.find({$where: 'this.pictures.length > 0'}).sort('-created').limit(10).execFind()

Returns nothing. Leaving this.pictureswithout the length bit does work, but then it also returns empty records, of course.

什么都不返回。离开this.pictures没有长度位确实有效,但当然它也会返回空记录。

回答by Chris'

If you also have documents that don't have the key, you can use:

如果您还有没有密钥的文档,您可以使用:

ME.find({ pictures: { $exists: true, $not: {$size: 0} } })

MongoDB don't use indexes if $size is involved, so here is a better solution:

如果涉及 $size,MongoDB 不使用索引,所以这里有一个更好的解决方案:

ME.find({ pictures: { $exists: true, $ne: [] } })

Since MongoDB 2.6 release, you can compare with the operator $gtbut could lead to unexpected results (you can find a detailled explanation in this answer):

从 MongoDB 2.6 版本开始,您可以与运算符进行比较,$gt但可能会导致意外结果(您可以在此答案中找到详细解释):

ME.find({ pictures: { $gt: [] } })

回答by skerit

After some more looking, especially in the mongodb documents, and puzzling bits together, this was the answer:

经过更多的查找,尤其是在 mongodb 文档中,并将一些令人费解的部分放在一起,这就是答案:

ME.find({pictures: {$exists: true, $not: {$size: 0}}})

回答by tenbatsu

This might also work for you:

这也可能对您有用:

ME.find({'pictures.0': {$exists: true}});

回答by wojcikstefan

You care about two things when querying - accuracy and performance. With that in mind, I tested a few different approaches in MongoDB v3.0.14.

查询时您关心两件事 - 准确性和性能。考虑到这一点,我在 MongoDB v3.0.14 中测试了几种不同的方法。

TL;DR db.doc.find({ nums: { $gt: -Infinity }})is the fastest and most reliable (at least in the MongoDB version I tested).

TL;DRdb.doc.find({ nums: { $gt: -Infinity }})是最快和最可靠的(至少在我测试的 MongoDB 版本中)。

EDIT: This no longer works in MongoDB v3.6! See the comments under this post for a potential solution.

编辑:这不再适用于 MongoDB v3.6!请参阅此帖子下的评论以获取潜在的解决方案。

Setup

设置

I inserted 1k docs w/o a list field, 1k docs with an empty list, and 5 docs with a non-empty list.

我插入了 1k 份文档(不包含列表字段)、1k 份文档带有空列表,以及 5 份文档带有非空列表。

for (var i = 0; i < 1000; i++) { db.doc.insert({}); }
for (var i = 0; i < 1000; i++) { db.doc.insert({ nums: [] }); }
for (var i = 0; i < 5; i++) { db.doc.insert({ nums: [1, 2, 3] }); }
db.doc.createIndex({ nums: 1 });

I recognize this isn't enough of a scale to take performance as seriously as I am in the tests below, but it's enough to present the correctness of various queries and behavior of chosen query plans.

我认识到这不足以像我在下面的测试中那样认真对待性能,但它足以展示各种查询的正确性和所选查询计划的行为。

Tests

测试

db.doc.find({'nums': {'$exists': true}})returns wrong results (for what we're trying to accomplish).

db.doc.find({'nums': {'$exists': true}})返回错误的结果(对于我们想要完成的事情)。

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': {'$exists': true}}).count()
1005

--

——

db.doc.find({'nums.0': {'$exists': true}})returns correct results, but it's also slow using a full collection scan (notice COLLSCANstage in the explanation).

db.doc.find({'nums.0': {'$exists': true}})返回正确的结果,但使用完整集合扫描也很慢(COLLSCAN解释中的注意阶段)。

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums.0': {'$exists': true}}).count()
5
MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums.0': {'$exists': true}}).explain()
{
  "queryPlanner": {
    "plannerVersion": 1,
    "namespace": "test.doc",
    "indexFilterSet": false,
    "parsedQuery": {
      "nums.0": {
        "$exists": true
      }
    },
    "winningPlan": {
      "stage": "COLLSCAN",
      "filter": {
        "nums.0": {
          "$exists": true
        }
      },
      "direction": "forward"
    },
    "rejectedPlans": [ ]
  },
  "serverInfo": {
    "host": "MacBook-Pro",
    "port": 27017,
    "version": "3.0.14",
    "gitVersion": "08352afcca24bfc145240a0fac9d28b978ab77f3"
  },
  "ok": 1
}

--

——

db.doc.find({'nums': { $exists: true, $gt: { '$size': 0 }}})returns wrong results. That's because of an invalid index scan advancing no documents. It will likely be accurate but slow without the index.

db.doc.find({'nums': { $exists: true, $gt: { '$size': 0 }}})返回错误的结果。那是因为无效的索引扫描不推进任何文档。如果没有索引,它可能会准确但很慢。

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $exists: true, $gt: { '$size': 0 }}}).count()
0
MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $exists: true, $gt: { '$size': 0 }}}).explain('executionStats').executionStats.executionStages
{
  "stage": "KEEP_MUTATIONS",
  "nReturned": 0,
  "executionTimeMillisEstimate": 0,
  "works": 2,
  "advanced": 0,
  "needTime": 0,
  "needFetch": 0,
  "saveState": 0,
  "restoreState": 0,
  "isEOF": 1,
  "invalidates": 0,
  "inputStage": {
    "stage": "FETCH",
    "filter": {
      "$and": [
        {
          "nums": {
            "$gt": {
              "$size": 0
            }
          }
        },
        {
          "nums": {
            "$exists": true
          }
        }
      ]
    },
    "nReturned": 0,
    "executionTimeMillisEstimate": 0,
    "works": 1,
    "advanced": 0,
    "needTime": 0,
    "needFetch": 0,
    "saveState": 0,
    "restoreState": 0,
    "isEOF": 1,
    "invalidates": 0,
    "docsExamined": 0,
    "alreadyHasObj": 0,
    "inputStage": {
      "stage": "IXSCAN",
      "nReturned": 0,
      "executionTimeMillisEstimate": 0,
      "works": 1,
      "advanced": 0,
      "needTime": 0,
      "needFetch": 0,
      "saveState": 0,
      "restoreState": 0,
      "isEOF": 1,
      "invalidates": 0,
      "keyPattern": {
        "nums": 1
      },
      "indexName": "nums_1",
      "isMultiKey": true,
      "direction": "forward",
      "indexBounds": {
        "nums": [
          "({ $size: 0.0 }, [])"
        ]
      },
      "keysExamined": 0,
      "dupsTested": 0,
      "dupsDropped": 0,
      "seenInvalidated": 0,
      "matchTested": 0
    }
  }
}

--

——

db.doc.find({'nums': { $exists: true, $not: { '$size': 0 }}})returns correct results, but the performance is bad. It technically does an index scan, but then it still advances all the docs and then has to filter through them).

db.doc.find({'nums': { $exists: true, $not: { '$size': 0 }}})返回正确的结果,但性能很差。从技术上讲,它会进行索引扫描,但它仍然会推进所有文档,然后必须过滤它们)。

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $exists: true, $not: { '$size': 0 }}}).count()
5
MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $exists: true, $not: { '$size': 0 }}}).explain('executionStats').executionStats.executionStages
{
  "stage": "KEEP_MUTATIONS",
  "nReturned": 5,
  "executionTimeMillisEstimate": 0,
  "works": 2016,
  "advanced": 5,
  "needTime": 2010,
  "needFetch": 0,
  "saveState": 15,
  "restoreState": 15,
  "isEOF": 1,
  "invalidates": 0,
  "inputStage": {
    "stage": "FETCH",
    "filter": {
      "$and": [
        {
          "nums": {
            "$exists": true
          }
        },
        {
          "$not": {
            "nums": {
              "$size": 0
            }
          }
        }
      ]
    },
    "nReturned": 5,
    "executionTimeMillisEstimate": 0,
    "works": 2016,
    "advanced": 5,
    "needTime": 2010,
    "needFetch": 0,
    "saveState": 15,
    "restoreState": 15,
    "isEOF": 1,
    "invalidates": 0,
    "docsExamined": 2005,
    "alreadyHasObj": 0,
    "inputStage": {
      "stage": "IXSCAN",
      "nReturned": 2005,
      "executionTimeMillisEstimate": 0,
      "works": 2015,
      "advanced": 2005,
      "needTime": 10,
      "needFetch": 0,
      "saveState": 15,
      "restoreState": 15,
      "isEOF": 1,
      "invalidates": 0,
      "keyPattern": {
        "nums": 1
      },
      "indexName": "nums_1",
      "isMultiKey": true,
      "direction": "forward",
      "indexBounds": {
        "nums": [
          "[MinKey, MaxKey]"
        ]
      },
      "keysExamined": 2015,
      "dupsTested": 2015,
      "dupsDropped": 10,
      "seenInvalidated": 0,
      "matchTested": 0
    }
  }
}

--

——

db.doc.find({'nums': { $exists: true, $ne: [] }})returns correct results and is slightly faster, but the performance is still not ideal. It uses IXSCAN which only advances docs with an existing list field, but then has to filter out the empty lists one by one.

db.doc.find({'nums': { $exists: true, $ne: [] }})返回正确的结果,速度稍快,但性能仍然不理想。它使用 IXSCAN,它只使用现有列表字段推进文档,但必须一一过滤掉空列表。

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $exists: true, $ne: [] }}).count()
5
MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $exists: true, $ne: [] }}).explain('executionStats').executionStats.executionStages
{
  "stage": "KEEP_MUTATIONS",
  "nReturned": 5,
  "executionTimeMillisEstimate": 0,
  "works": 1018,
  "advanced": 5,
  "needTime": 1011,
  "needFetch": 0,
  "saveState": 15,
  "restoreState": 15,
  "isEOF": 1,
  "invalidates": 0,
  "inputStage": {
    "stage": "FETCH",
    "filter": {
      "$and": [
        {
          "$not": {
            "nums": {
              "$eq": [ ]
            }
          }
        },
        {
          "nums": {
            "$exists": true
          }
        }
      ]
    },
    "nReturned": 5,
    "executionTimeMillisEstimate": 0,
    "works": 1017,
    "advanced": 5,
    "needTime": 1011,
    "needFetch": 0,
    "saveState": 15,
    "restoreState": 15,
    "isEOF": 1,
    "invalidates": 0,
    "docsExamined": 1005,
    "alreadyHasObj": 0,
    "inputStage": {
      "stage": "IXSCAN",
      "nReturned": 1005,
      "executionTimeMillisEstimate": 0,
      "works": 1016,
      "advanced": 1005,
      "needTime": 11,
      "needFetch": 0,
      "saveState": 15,
      "restoreState": 15,
      "isEOF": 1,
      "invalidates": 0,
      "keyPattern": {
        "nums": 1
      },
      "indexName": "nums_1",
      "isMultiKey": true,
      "direction": "forward",
      "indexBounds": {
        "nums": [
          "[MinKey, undefined)",
          "(undefined, [])",
          "([], MaxKey]"
        ]
      },
      "keysExamined": 1016,
      "dupsTested": 1015,
      "dupsDropped": 10,
      "seenInvalidated": 0,
      "matchTested": 0
    }
  }
}

--

——

db.doc.find({'nums': { $gt: [] }})IS DANGEROUS BECAUSE DEPENDING ON THE INDEX USED IT MIGHT GIVE UNEXPECTED RESULTS. That's because of an invalid index scan which advances no documents.

db.doc.find({'nums': { $gt: [] }})是危险的,因为取决于所使用的指数,它可能会产生意想不到的结果。这是因为无效的索引扫描不会推进任何文档。

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $gt: [] }}).count()
0
MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $gt: [] }}).hint({ nums: 1 }).count()
0
MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $gt: [] }}).hint({ _id: 1 }).count()
5

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $gt: [] }}).explain('executionStats').executionStats.executionStages
{
  "stage": "KEEP_MUTATIONS",
  "nReturned": 0,
  "executionTimeMillisEstimate": 0,
  "works": 1,
  "advanced": 0,
  "needTime": 0,
  "needFetch": 0,
  "saveState": 0,
  "restoreState": 0,
  "isEOF": 1,
  "invalidates": 0,
  "inputStage": {
    "stage": "FETCH",
    "filter": {
      "nums": {
        "$gt": [ ]
      }
    },
    "nReturned": 0,
    "executionTimeMillisEstimate": 0,
    "works": 1,
    "advanced": 0,
    "needTime": 0,
    "needFetch": 0,
    "saveState": 0,
    "restoreState": 0,
    "isEOF": 1,
    "invalidates": 0,
    "docsExamined": 0,
    "alreadyHasObj": 0,
    "inputStage": {
      "stage": "IXSCAN",
      "nReturned": 0,
      "executionTimeMillisEstimate": 0,
      "works": 1,
      "advanced": 0,
      "needTime": 0,
      "needFetch": 0,
      "saveState": 0,
      "restoreState": 0,
      "isEOF": 1,
      "invalidates": 0,
      "keyPattern": {
        "nums": 1
      },
      "indexName": "nums_1",
      "isMultiKey": true,
      "direction": "forward",
      "indexBounds": {
        "nums": [
          "([], BinData(0, ))"
        ]
      },
      "keysExamined": 0,
      "dupsTested": 0,
      "dupsDropped": 0,
      "seenInvalidated": 0,
      "matchTested": 0
    }
  }
}

--

——

db.doc.find({'nums.0': { $gt: -Infinity }})returns correct results, but has bad performance (uses a full collection scan).

db.doc.find({'nums.0': { $gt: -Infinity }})返回正确的结果,但性能不佳(使用完整集合扫描)。

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums.0': { $gt: -Infinity }}).count()
5
MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums.0': { $gt: -Infinity }}).explain('executionStats').executionStats.executionStages
{
  "stage": "COLLSCAN",
  "filter": {
    "nums.0": {
      "$gt": -Infinity
    }
  },
  "nReturned": 5,
  "executionTimeMillisEstimate": 0,
  "works": 2007,
  "advanced": 5,
  "needTime": 2001,
  "needFetch": 0,
  "saveState": 15,
  "restoreState": 15,
  "isEOF": 1,
  "invalidates": 0,
  "direction": "forward",
  "docsExamined": 2005
}

--

——

db.doc.find({'nums': { $gt: -Infinity }})surprisingly, this works very well! It gives the right results and it's fast, advancing 5 docs from the index scan phase.

db.doc.find({'nums': { $gt: -Infinity }})令人惊讶的是,这非常有效!它给出了正确的结果,而且速度很快,从索引扫描阶段推进了 5 个文档。

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $gt: -Infinity }}).explain('executionStats').executionStats.executionStages
{
  "stage": "FETCH",
  "nReturned": 5,
  "executionTimeMillisEstimate": 0,
  "works": 16,
  "advanced": 5,
  "needTime": 10,
  "needFetch": 0,
  "saveState": 0,
  "restoreState": 0,
  "isEOF": 1,
  "invalidates": 0,
  "docsExamined": 5,
  "alreadyHasObj": 0,
  "inputStage": {
    "stage": "IXSCAN",
    "nReturned": 5,
    "executionTimeMillisEstimate": 0,
    "works": 15,
    "advanced": 5,
    "needTime": 10,
    "needFetch": 0,
    "saveState": 0,
    "restoreState": 0,
    "isEOF": 1,
    "invalidates": 0,
    "keyPattern": {
      "nums": 1
    },
    "indexName": "nums_1",
    "isMultiKey": true,
    "direction": "forward",
    "indexBounds": {
      "nums": [
        "(-inf.0, inf.0]"
      ]
    },
    "keysExamined": 15,
    "dupsTested": 15,
    "dupsDropped": 10,
    "seenInvalidated": 0,
    "matchTested": 0
  }
}

回答by JohnnyHK

Starting with the 2.6 release, another way to do this is to compare the field to an empty array:

从 2.6 版本开始,另一种方法是将字段与空数组进行比较:

ME.find({pictures: {$gt: []}})

Testing it out in the shell:

在 shell 中测试它:

> db.ME.insert([
{pictures: [1,2,3]},
{pictures: []},
{pictures: ['']},
{pictures: [0]},
{pictures: 1},
{foobar: 1}
])

> db.ME.find({pictures: {$gt: []}})
{ "_id": ObjectId("54d4d9ff96340090b6c1c4a7"), "pictures": [ 1, 2, 3 ] }
{ "_id": ObjectId("54d4d9ff96340090b6c1c4a9"), "pictures": [ "" ] }
{ "_id": ObjectId("54d4d9ff96340090b6c1c4aa"), "pictures": [ 0 ] }

So it properly includes the docs where pictureshas at least one array element, and excludes the docs where picturesis either an empty array, not an array, or missing.

因此,它正确地包含了pictures至少有一个数组元素的文档,并排除了pictures空数组、非数组或缺失的文档。

回答by Paul Imisi

You can use any of the following to achieve this.
Both also take care of not returning a result for objects that don't have the requested key in them:

您可以使用以下任何一种方法来实现此目的。
两者还注意不为其中没有请求的键的对象返回结果:

db.video.find({pictures: {$exists: true, $gt: {$size: 0}}})
db.video.find({comments: {$exists: true, $not: {$size: 0}}})

回答by SC1000

Retrieve all and only the documents where 'pictures' is an array and is not empty

检索所有且仅检索“图片”为数组且不为空的文档

ME.find({pictures: {$type: 'array', $ne: []}})

If using a MongoDb version prior to 3.2, use $type: 4instead of $type: 'array'. Notice that this solution doesn't even use $size, so there's no problem with indexes ("Queries cannot use indexes for the $size portion of a query")

如果使用3.2之前的 MongoDb 版本,请使用$type: 4代替$type: 'array'。请注意,此解决方案甚至不使用$size,因此索引没有问题(“查询不能对查询的 $size 部分使用索引”)

Other solutions, including these (accepted answer):

其他解决方案,包括这些(已接受的答案):

ME.find({ pictures: { $exists: true, $not: {$size: 0} } }); ME.find({ pictures: { $exists: true, $ne: [] } })

ME.find({ 图片: { $exists: true, $not: {$size: 0} } }); ME.find({ 图片: { $exists: true, $ne: [] } })

are wrongbecause they return documents even if, for example, 'pictures' is null, undefined, 0, etc.

错误的,因为它们会返回文档,例如,“图片”是nullundefined、 0 等。

回答by Andres Moreno

Use the $elemMatchoperator: according to the documentation

使用$elemMatch操作符:根据文档

The $elemMatch operator matches documents that contain an array field with at least one element that matches all the specified query criteria.

$elemMatch 运算符匹配包含至少一个元素匹配所有指定查询条件的数组字段的文档。

$elemMatchesmakes sure that the value is an array and that it is not empty. So the query would be something like

$elemMatches确保该值是一个数组并且它不为空。所以查询将类似于

ME.find({ pictures: { $elemMatch: {$exists: true }}})

ME.find({ pictures: { $elemMatch: {$exists: true }}})

PS A variant of this code is found in MongoDB University's M121 course.

PS 在 MongoDB 大学的 M121 课程中可以找到此代码的变体。

回答by Eat at Joes

You can also use the helper method Existsover the Mongo operator $exists

您还可以在 Mongo 运算符 $exists 上使用辅助方法Exists

ME.find()
    .exists('pictures')
    .where('pictures').ne([])
    .sort('-created')
    .limit(10)
    .exec(function(err, results){
        ...
    });

回答by Prabhat Yadav

{ $where: "this.pictures.length > 1" }

use the $where and pass the this.field_name.length which return the size of array field and check it by comparing with number. if any array have any value than array size must be at least 1. so all the array field have length more than one, it means it have some data in that array

使用 $where 并传递返回数组字段大小的 this.field_name.length 并通过与数字进行比较来检查它。如果任何数组有任何值,数组大小必须至少为 1。所以所有数组字段的长度都大于 1,这意味着它在该数组中有一些数据