String field value length in MongoDB

Disclaimer: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/29577713/

Time: 2020-09-08 20:23:35  Source: igfitidea

String field value length in mongoDB

mongodb field string-length

Asked by SURYA GOKARAJU

The data type of the field is String. I would like to fetch the documents where the character length of the name field is greater than 40.

I tried these queries, but they return an error. Query 1:

db.usercollection.find(
{$where: "(this.name.length > 40)"}
).limit(2);

Output: error: {
    "$err" : "TypeError: Cannot read property 'length' of undefined near '40)' ",
    "code" : 16722
}

This works in 2.4.9, but my version is 2.6.5.
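
One likely reading of the TypeError is that at least one document in the collection has no name field, so this.name is undefined when the $where expression runs. Below is a minimal plain-JavaScript sketch of that situation. The sample documents and the whereFilter helper are hypothetical stand-ins (not part of the question) for how $where binds each document to this.

```javascript
// Hypothetical sample documents; the second one has no `name` field.
const sampleDocs = [
  { _id: 1, name: "x".repeat(45) },
  { _id: 2 },
  { _id: 3, name: "short" },
];

// Stand-in for $where: evaluate a predicate with `this` bound to each document.
function whereFilter(docs, predicate) {
  return docs.filter((doc) => predicate.call(doc));
}

// Unguarded predicate, as in the question: throws on the document without `name`.
function unguarded() {
  return this.name.length > 40;
}

// Guarded predicate: only reads .length when `name` is a string.
function guarded() {
  return typeof this.name === "string" && this.name.length > 40;
}

let unguardedError = null;
try {
  whereFilter(sampleDocs, unguarded);
} catch (err) {
  unguardedError = err;
}

const guardedIds = whereFilter(sampleDocs, guarded).map((doc) => doc._id);
console.log(unguardedError instanceof TypeError, guardedIds);
```

The guarded form corresponds to adding a type or existence check to the query before reading the length.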

Answered by chridam

For MongoDB 3.6 and newer:

The $expr operator allows the use of aggregation expressions within the query language, so you can use the $strLenCP operator to check the length of the string as follows:

db.usercollection.find({ 
    "name": { "$exists": true },
    "$expr": { "$gt": [ { "$strLenCP": "$name" }, 40 ] } 
})
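
Note that $strLenCP counts code points, whereas the JavaScript .length property used in $where counts UTF-16 code units, so the two can disagree for characters outside the Basic Multilingual Plane. A small plain-JavaScript sketch of the difference follows; the helper name codePointLength is made up for illustration.

```javascript
// Hypothetical helper: counts code points, similar in spirit to $strLenCP.
function codePointLength(s) {
  return Array.from(s).length;
}

const plain = "hello";
const withEmoji = "hi\u{1F600}"; // "hi" followed by one emoji code point

console.log(plain.length, codePointLength(plain));         // 5 5
console.log(withEmoji.length, codePointLength(withEmoji)); // 4 3
```

For plain ASCII names both counts are the same, so the difference only matters near the threshold for values with such characters.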


For MongoDB 3.4 and newer:

You can also use the aggregation framework with the $redact pipeline operator, which lets you process the logical condition with the $cond operator and uses the special operations $$KEEP to "keep" the document where the logical condition is true or $$PRUNE to "remove" the document where the condition is false.

This operation is similar to having a $project pipeline that selects the fields in the collection and creates a new field holding the result of the logical condition, followed by a subsequent $match, except that $redact uses a single pipeline stage, which is more efficient.

As for the logical condition, there are String Aggregation Operators, among them the $strLenCP operator, which you can use to check the length of the string. If the length is $gt a specified value, then this is a true match and the document is "kept". Otherwise it is "pruned" and discarded.

Consider running the following aggregate operation which demonstrates the above concept:

db.usercollection.aggregate([
    { "$match": { "name": { "$exists": true } } },
    {
        "$redact": {
            "$cond": [
                { "$gt": [ { "$strLenCP": "$name" }, 40] },
                "$$KEEP",
                "$$PRUNE"
            ]
        }
    },
    { "$limit": 2 }
])


If using $where, try your query without the enclosing brackets:

db.usercollection.find({$where: "this.name.length > 40"}).limit(2);

A better query would be to check for the field's existence and then check the length:

db.usercollection.find({name: {$type: 2}, $where: "this.name.length > 40"}).limit(2); 

or:

db.usercollection.find({name: {$exists: true}, $where: "this.name.length > 40"}).limit(2);

MongoDB evaluates non-$where query operations before $where expressions, and non-$where query statements may use an index. Much better performance comes from storing the length of the string as another field, which you can then index or search on; applying $where will be much slower compared to that. It's recommended to use JavaScript expressions and the $where operator as a last resort, when you can't structure the data in any other way or when you are dealing with a small subset of data.
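
The stored-length idea could look roughly like the sketch below. The field name nameLength and the helper withNameLength are assumptions made for illustration, and keeping the stored value in sync whenever name changes is left to the application.

```javascript
// Hypothetical helper: returns a copy of the document with a `nameLength` field.
// The field name is an assumption, not part of the original schema.
function withNameLength(doc) {
  const len = typeof doc.name === "string" ? doc.name.length : null;
  return { ...doc, nameLength: len };
}

const prepared = withNameLength({ name: "Ada Lovelace" });
console.log(prepared); // { name: 'Ada Lovelace', nameLength: 12 }

// In the mongo shell, the stored field could then be indexed and queried:
//   db.usercollection.insertOne(prepared)
//   db.usercollection.createIndex({ nameLength: 1 })
//   db.usercollection.find({ nameLength: { $gt: 40 } }).limit(2)
```

With the length precomputed, the query becomes an ordinary range comparison that can use the index instead of evaluating JavaScript per document.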

A different and faster approach that avoids the $where operator is the $regex operator. Consider the following pattern, which searches for names longer than 40 characters:

db.usercollection.find({"name": {"$type": 2, "$regex": /^.{41,}$/}}).limit(2); 

Note, from the docs:

If an index exists for the field, then MongoDB matches the regular expression against the values in the index, which can be faster than a collection scan. Further optimization can occur if the regular expression is a “prefix expression”, which means that all potential matches start with the same string. This allows MongoDB to construct a “range” from that prefix and only match against those values from the index that fall within that range.

A regular expression is a “prefix expression” if it starts with a caret (^) or a left anchor (\A), followed by a string of simple symbols. For example, the regex /^abc.*/ will be optimized by matching only against the values from the index that start with abc.

Additionally, while /^a/, /^a.*/, and /^a.*$/ match equivalent strings, they have different performance characteristics. All of these expressions use an index if an appropriate index exists; however, /^a.*/ and /^a.*$/ are slower. /^a/ can stop scanning after matching the prefix.
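
The boundary behaviour of the /^.{41,}$/ pattern used above can be checked in plain JavaScript. This is only a sketch: JavaScript's regex engine stands in for the one MongoDB uses, and the sample strings are made up. It also shows that . does not match a newline, so multi-line values are not counted the way a length check would count them.

```javascript
const longerThan40 = /^.{41,}$/;

console.log(longerThan40.test("a".repeat(40))); // false: exactly 40 characters
console.log(longerThan40.test("a".repeat(41))); // true: 41 characters

// 61 characters in total, but `.` does not match the newline in the middle.
const multiLine = "a".repeat(30) + "\n" + "a".repeat(30);
console.log(longerThan40.test(multiLine));      // false
```

A character class such as [\s\S] matches newlines as well, if multi-line values need to be included.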

Answered by Rajdeep Gautam

Here is one way you can achieve this in MongoDB.

db.usercollection.find({ $where: 'this.name.length < 4' })

Answered by Fumiya Karasawa

Queries with $where and $expr are slow if there are too many documents.

Using $regex is much faster than $where or $expr.

db.usercollection.find({ 
  "name": /^[\s\S]{40,}$/, // name.length >= 40
})

or 

db.usercollection.find({ 
  "name": { "$regex": "^[\\s\\S]{40,}$" }, // name.length >= 40 (backslashes doubled inside a string)
})

This query has the same meaning as:

db.usercollection.find({ 
  "$where": "this.name && this.name.length >= 40",
})

or

db.usercollection.find({ 
    "name": { "$exists": true },
    "$expr": { "$gte": [ { "$strLenCP": "$name" }, 40 ] } 
})

I tested each query against my collection.

# find
$where: 10529.359ms
$expr: 5305.801ms
$regex: 2516.124ms

# count
$where: 10872.006ms
$expr: 2630.155ms
$regex: 158.066ms

Answered by Udara Gunathilake

I had a similar scenario, but in my case the string is not a first-level attribute; it is inside an object. I couldn't find a suitable answer for it here, so I thought I'd share my solution (hope this helps anyone with a similar problem).

Parent Collection 

{
    "Child": {
        "name": "Random Name",
        "Age": "09"
    }
}

Ex: Suppose we need only the documents whose child's name is longer than 10 characters.

db.getCollection('Parent').find({$where: function() {
    // Guard against documents without Child.name, then compare the length.
    return Boolean(this.Child) && typeof this.Child.name === "string" && this.Child.name.length > 10;
}})
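
For comparison, the same nested check might also be expressed without $where, using dot notation together with $expr and $strLenCP. This is a sketch that assumes MongoDB 3.6 or newer. The helper longerThanFilter is hypothetical, and whether the type guard fully shields $strLenCP from non-string values should be verified against the server version in use.

```javascript
// Hypothetical helper: builds a find() filter for
// "the string at fieldPath is longer than minLength code points".
function longerThanFilter(fieldPath, minLength) {
  return {
    [fieldPath]: { $type: "string" },
    $expr: { $gt: [{ $strLenCP: "$" + fieldPath }, minLength] },
  };
}

const nestedFilter = longerThanFilter("Child.name", 10);
console.log(JSON.stringify(nestedFilter));

// In the mongo shell: db.getCollection('Parent').find(nestedFilter)
```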

Answered by Anban

This query will give both field value and length:

db.usercollection.aggregate([
    {
        $project: {
            "name": 1,
            "length": { $strLenCP: "$name" }
        }
    }
])