String field value length in MongoDB
Original question: http://stackoverflow.com/questions/29577713/
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors on Stack Overflow.
Asked by SURYA GOKARAJU
The data type of the field is String. I would like to fetch the documents where the character length of the name field is greater than 40.
I tried the following query, but it returns an error:
db.usercollection.find(
{$where: "(this.name.length > 40)"}
).limit(2);
Output:
error: {
"$err" : "TypeError: Cannot read property 'length' of undefined near '40)' ",
"code" : 16722
}
This works in 2.4.9, but my version is 2.6.5.
Answered by chridam
For MongoDB 3.6 and newer:
The $expr operator allows the use of aggregation expressions within the query language, so you can use the $strLenCP operator to check the length of the string as follows:
db.usercollection.find({
"name": { "$exists": true },
"$expr": { "$gt": [ { "$strLenCP": "$name" }, 40 ] }
})
For MongoDB 3.4 and newer:
You can also use the aggregation framework with the $redact pipeline operator, which lets you process the logical condition with the $cond operator. It uses the special variable $$KEEP to "keep" the document where the logical condition is true, or $$PRUNE to "remove" the document where the condition is false.
This operation is similar to having a $project pipeline stage that selects the fields in the collection and creates a new field holding the result of the logical condition, followed by a $match stage, except that $redact uses a single pipeline stage, which is more efficient.
As for the logical condition, there are String Aggregation Operators you can use, such as the $strLenCP operator, to check the length of the string. If the length is $gt a specified value, then this is a true match and the document is "kept". Otherwise it is "pruned" and discarded.
Consider running the following aggregate operation, which demonstrates the above concept:
db.usercollection.aggregate([
    { "$match": { "name": { "$exists": true } } },
    {
        "$redact": {
            "$cond": [
                { "$gt": [ { "$strLenCP": "$name" }, 40 ] },
                "$$KEEP",
                "$$PRUNE"
            ]
        }
    },
    { "$limit": 2 }
])
If using $where, try your query without the enclosing brackets:
db.usercollection.find({$where: "this.name.length > 40"}).limit(2);
A better query would be to check for the field's existence and then check the length:
db.usercollection.find({name: {$type: 2}, $where: "this.name.length > 40"}).limit(2);
or:
db.usercollection.find({name: {$exists: true}, $where: "this.name.length > 40"}).limit(2);
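To see why the guard matters, here is a small plain-JavaScript sketch (no MongoDB involved) that applies the same kind of predicate to in-memory documents. The sample documents and the names `unguarded` and `guarded` are made up for illustration; the point is only that reading `.length` on a missing field throws the TypeError reported in the question.

```javascript
// Hypothetical sample documents; the second one has no "name" field.
const docs = [
  { _id: 1, name: "x".repeat(41) },
  { _id: 2 },
  { _id: 3, name: "short" },
];

// Unguarded predicate, analogous to $where: "this.name.length > 40".
const unguarded = (doc) => doc.name.length > 40;

// Guarded predicate: only read .length when name is actually a string.
const guarded = (doc) => typeof doc.name === "string" && doc.name.length > 40;

let unguardedError = null;
try {
  docs.filter(unguarded);
} catch (err) {
  unguardedError = err; // thrown when the document without "name" is reached
}

const matchedIds = docs.filter(guarded).map((doc) => doc._id);
console.log(unguardedError instanceof TypeError, matchedIds);
```

In the real queries above, the {$type: 2} or {$exists: true} condition plays the role of the `typeof` check.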
MongoDB evaluates non-$where query operations before $where expressions, and non-$where query statements may use an index. For much better performance, store the length of the string as another field so that you can index or search on it; applying $where will be much slower in comparison. It's recommended to use JavaScript expressions and the $where operator only as a last resort, when you can't structure the data in any other way or when you are dealing with a small subset of data.
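The "store the length as another field" idea could look roughly like the sketch below. The field name `nameLength` and the helper `withNameLength` are assumptions made for this example, not part of the original schema, and the application would need to keep the field in sync whenever `name` changes.

```javascript
// Hypothetical helper: returns a copy of the document with a precomputed length field.
// "nameLength" is an assumed field name chosen for this sketch.
function withNameLength(doc) {
  const length = typeof doc.name === "string" ? doc.name.length : 0;
  return { ...doc, nameLength: length };
}

// Document as it would be written to the collection.
const stored = withNameLength({ name: "a".repeat(45) });

// Plain range filter on the stored field; no JavaScript evaluation is needed.
const lengthFilter = { nameLength: { $gt: 40 } };

console.log(stored.nameLength, JSON.stringify(lengthFilter));
```

In the shell, such a filter would be passed to db.usercollection.find(lengthFilter), and an ordinary index created with db.usercollection.createIndex({ nameLength: 1 }) could then serve it.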
A different and faster approach that avoids the use of the $where operator is the $regex operator. Consider the following pattern, which matches strings of more than 40 characters:
db.usercollection.find({"name": {"$type": 2, "$regex": /^.{41,}$/}}).limit(2);
Note, from the docs:
If an index exists for the field, then MongoDB matches the regular expression against the values in the index, which can be faster than a collection scan. Further optimization can occur if the regular expression is a “prefix expression”, which means that all potential matches start with the same string. This allows MongoDB to construct a “range” from that prefix and only match against those values from the index that fall within that range.
A regular expression is a “prefix expression” if it starts with a caret (^) or a left anchor (\A), followed by a string of simple symbols. For example, the regex /^abc.*/ will be optimized by matching only against the values from the index that start with abc.
Additionally, while /^a/, /^a.*/, and /^a.*$/ match equivalent strings, they have different performance characteristics. All of these expressions use an index if an appropriate index exists; however, /^a.*/ and /^a.*$/ are slower. /^a/ can stop scanning after matching the prefix.
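The boundary behaviour of the length pattern can be checked with a short sketch. It runs the pattern through JavaScript's own regex engine, whereas MongoDB evaluates $regex with PCRE, so treat it as an illustration rather than a guarantee about server behaviour. The variant using [\s\S] and the sample strings are additions made for this example.

```javascript
const moreThan40 = /^.{41,}$/;             // pattern from the query above
const moreThan40AnyChar = /^[\s\S]{41,}$/; // assumed variant that also accepts newlines

const s40 = "a".repeat(40);
const s41 = "a".repeat(41);
// 41 characters in total, including one newline in the middle.
const multiline = "a".repeat(20) + "\n" + "b".repeat(20);

const results = {
  exactly40: moreThan40.test(s40),
  exactly41: moreThan40.test(s41),
  multilineDot: moreThan40.test(multiline),
  multilineAnyChar: moreThan40AnyChar.test(multiline),
};
console.log(results);
```

Because "." does not match a newline by default, a value containing line breaks can be missed by the first pattern even when it is long enough; the character class [\s\S] avoids that.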
Answered by Rajdeep Gautam
Here is one way you can achieve this in MongoDB:
db.usercollection.find({ $where: 'this.name.length < 4' })
Answered by Fumiya Karasawa
Queries with $where and $expr are slow if there are too many documents.
Using $regex is much faster than $where or $expr.
db.usercollection.find({
"name": /^[\s\S]{40,}$/, // name.length >= 40
})
or
db.usercollection.find({
"name": { "$regex": "^[\\s\\S]{40,}$" }, // name.length >= 40; backslashes are doubled inside a string literal
})
This query has the same meaning as:
db.usercollection.find({
"$where": "this.name && this.name.length >= 40",
})
or
db.usercollection.find({
"name": { "$exists": true },
"$expr": { "$gte": [ { "$strLenCP": "$name" }, 40 ] }
})
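One subtlety: these forms may not agree for every string. JavaScript's `.length` (as used in $where) counts UTF-16 code units, while $strLenCP counts code points. The sketch below shows the difference in plain JavaScript; the sample string is made up, and `[...text].length` is used here only as a stand-in for a code-point count.

```javascript
// 25 emoji, each a single code point outside the Basic Multilingual Plane.
const text = "\u{1F600}".repeat(25);

// UTF-16 code units: what this.name.length sees inside $where.
const codeUnits = text.length;

// Code points: the spread operator iterates a string by code point.
const codePoints = [...text].length;

console.log(codeUnits, codePoints);
```

For names made only of characters in the Basic Multilingual Plane the two counts are identical, so the queries above behave the same on such data.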
I tested each query against my collection.
# find
$where: 10529.359ms
$expr: 5305.801ms
$regex: 2516.124ms
# count
$where: 10872.006ms
$expr: 2630.155ms
$regex: 158.066ms
Answered by Udara Gunathilake
I had a similar scenario, but in my case the string is not a first-level attribute; it is inside an object. I couldn't find a suitable answer for it here, so I thought I would share my solution (hopefully it helps anyone with a similar problem).
Parent Collection
{
    "Child": {
        "name": "Random Name",
        "Age": "09"
    }
}
Ex: Suppose we need to get only the documents where the child's name is longer than 10 characters.
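db.getCollection('Parent').find({$where: function() {
    // Skip documents without Child.name, then compare the length directly
    // (no loop over the string is needed).
    if (!this.Child || typeof this.Child.name !== "string") {
        return false;
    }
    return this.Child.name.length > 10;
}})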
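On MongoDB 3.6 and newer, the same nested check could probably be written without $where by combining dot notation with the $expr form shown in the earlier answer. The helper `longerThan` below is hypothetical and only builds the filter object; the $type guard mirrors the existence check used earlier on this page.

```javascript
// Hypothetical helper: builds a filter meaning
// "the string at fieldPath has more than minLength code points".
function longerThan(fieldPath, minLength) {
  return {
    // Restrict to documents where the field is a string.
    [fieldPath]: { $type: "string" },
    // Compare the code-point length using an aggregation expression.
    $expr: { $gt: [{ $strLenCP: "$" + fieldPath }, minLength] },
  };
}

const nestedFilter = longerThan("Child.name", 10);
console.log(JSON.stringify(nestedFilter, null, 2));
```

The resulting object would be passed as db.getCollection('Parent').find(nestedFilter).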
Answered by Anban
This query returns both the field value and its length:
db.usercollection.aggregate([
    {
        $project: {
            "name": 1,
            "length": { $strLenCP: "$name" }
        }
    }
])