MongoDB: Is it possible to make a case-insensitive query?

Question

Asked by Luke Dennis

Example:

> db.stuff.save({"foo":"bar"});

> db.stuff.find({"foo":"bar"}).count();
1
> db.stuff.find({"foo":"BAR"}).count();
0

Answer 1

Answered by rfunduk

You could use a regex.

In your example that would be:

db.stuff.find( { foo: /^bar$/i } );

I must say, though, maybe you could just downcase (or upcase) the value on the way in rather than incurring the extra cost every time you query for it. Obviously this won't work for people's names and such, but it may suit use cases like tags.

Answer 2

Answered by Dan

UPDATE:

The original answer is now obsolete. MongoDB now supports advanced full-text searching, with many features.

ORIGINAL ANSWER:

It should be noted that searching with a regex's case-insensitive /i flag means that MongoDB cannot search by index, so queries against large datasets can take a long time.

Even with small datasets, it's not very efficient. You take a far bigger CPU hit than your query warrants, which could become an issue if you are trying to achieve scale.

As an alternative, you can store an uppercase copy and search against that. For instance, I have a User table that has a username which is mixed case, but the id is an uppercase copy of the username. This ensures case-sensitive duplication is impossible (having both "Foo" and "foo" will not be allowed), and I can search by id = username.toUpperCase() to get a case-insensitive search for username.
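The uppercase-copy idea can be sketched in plain JavaScript as follows. This is a minimal sketch, not the answerer's actual code: the helper names makeUserDoc and userLookupFilter and the users collection mentioned in the comments are assumptions made for illustration.

```javascript
// Hypothetical helper: build a user document whose _id is the
// upper-cased username, while keeping the original mixed-case value.
function makeUserDoc(username) {
  return { _id: username.toUpperCase(), username: username };
}

// Hypothetical helper: build the filter for a case-insensitive lookup.
// Because _id is always upper case, an exact match on it is enough.
function userLookupFilter(username) {
  return { _id: username.toUpperCase() };
}

const doc = makeUserDoc("Foo");
const filter = userLookupFilter("foo");

console.log(doc);    // { _id: 'FOO', username: 'Foo' }
console.log(filter); // { _id: 'FOO' }

// In the mongo shell these objects would be passed along the lines of
// db.users.insert(doc) and db.users.find(filter); "users" is an assumed name.
```

Since _id is unique, inserting documents for both "Foo" and "foo" would collide on the same _id, which is what rules out case-only duplicates.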

If your field is large, such as a message body, duplicating data is probably not a good option. I believe using an external indexer like Apache Lucene is the best option in that case.

Answer 3

Answered by Fotios

If you need to create the regexp from a variable, this is a much better way to do it: https://stackoverflow.com/a/10728069/309514

You can then do something like:

var string = "SomeStringToFind";
var regex = new RegExp(["^", string, "$"].join(""), "i");
// Creates a regex of: /^SomeStringToFind$/i
db.stuff.find( { foo: regex } );

This has the benefit of being more programmatic, and if you're reusing the regex a lot you can get a performance boost by compiling it ahead of time.

Answer 4

Answered by jflaflamme

Keep in mind that the previous example:

db.stuff.find( { foo: /bar/i } );

will cause every entry containing bar to match the query (bar1, barxyz, openbar). That could be very dangerous for a username search in an auth function.

You may need to make it match only the search term by using the appropriate regexp syntax as:

db.stuff.find( { foo: /^bar$/i } );
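The difference between the two patterns can be checked outside the database with plain JavaScript regular expressions. This is only an illustrative sketch; the sample values are the ones named in this answer plus two exact matches.

```javascript
// Sample values: two exact (case-insensitive) matches and three near misses.
const values = ["bar", "BAR", "bar1", "barxyz", "openbar"];

const unanchored = /bar/i;  // matches "bar" anywhere in the string
const anchored = /^bar$/i;  // matches only the whole string "bar"

const looseMatches = values.filter((v) => unanchored.test(v));
const exactMatches = values.filter((v) => anchored.test(v));

console.log(looseMatches); // [ 'bar', 'BAR', 'bar1', 'barxyz', 'openbar' ]
console.log(exactMatches); // [ 'bar', 'BAR' ]
```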

See http://www.regular-expressions.info/ for syntax help on regular expressions.

Answer 5

Answered by user3413723

Starting with MongoDB 3.4, the recommended way to perform fast case-insensitive searches is to use a Case Insensitive Index.

I personally emailed one of the founders asking him to get this working, and he made it happen! It had been an open issue on JIRA since 2009, and many people had requested the feature. Here's how it works:

A case-insensitive index is made by specifying a collation with a strength of either 1 or 2. You can create a case-insensitive index like this:

db.cities.createIndex(
  { city: 1 },
  { 
    collation: {
      locale: 'en',
      strength: 2
    }
  }
);

You can also specify a default collation for a collection when you create it:

db.createCollection('cities', { collation: { locale: 'en', strength: 2 } } );

In either case, in order to use the case-insensitive index, you need to specify in the find operation the same collation that was used when creating the index or the collection:

db.cities.find(
  { city: 'new york' }
).collation(
  { locale: 'en', strength: 2 }
);

This will return "New York", "new york", "New york" etc.
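To get a feel for what a strength-2 comparison treats as equal, the behaviour can be approximated in plain JavaScript with Intl.Collator. This is only an analogy and an assumption on my part: MongoDB applies its collation on the server, and the "accent" sensitivity used here is a rough stand-in for strength 2 (case ignored, diacritics still significant), so edge cases may differ. The sample city list is made up for illustration.

```javascript
// Assumption: sensitivity "accent" roughly mirrors collation strength 2,
// i.e. case differences are ignored but diacritics still count.
const collator = new Intl.Collator("en", { sensitivity: "accent" });

// Made-up sample data for illustration.
const cities = ["New York", "new york", "New york", "Newark", "NEW YORK"];

// Keep every value that compares equal to the search term.
const matches = cities.filter((c) => collator.compare(c, "new york") === 0);

console.log(matches); // [ 'New York', 'new york', 'New york', 'NEW YORK' ]

// Diacritics are still significant at this level.
const sameIgnoringAccent = collator.compare("cafe", "café") === 0;
console.log(sameIgnoringAccent); // false
```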

Other notes

The answers suggesting full-text search are wrong in this case (and potentially dangerous). The question was about making a case-insensitive query, e.g. username: 'bill' matching BILL or Bill, not a full-text search query, which would also match stemmed words of bill, such as Bills, billed, etc.
The answers suggesting regular expressions are slow because, even with indexes, the documentation states:
"Case insensitive regular expression queries generally cannot use indexes effectively. The $regex implementation is not collation-aware and is unable to utilize case-insensitive indexes."
$regex answers also run the risk of user input injection.

Answer 6

Answered by rshivamca

db.zipcodes.find({city : "NEW YORK"}); // Case-sensitive
db.zipcodes.find({city : /NEW york/i}); // Note the 'i' flag for case-insensitivity

Answer 7

Answered by vijay

TL;DR

The correct way to do this in MongoDB

Do not use RegExp

Go natural and use MongoDB's built-in text indexing and search

Step 1:

db.articles.insert(
   [
     { _id: 1, subject: "coffee", author: "xyz", views: 50 },
     { _id: 2, subject: "Coffee Shopping", author: "efg", views: 5 },
     { _id: 3, subject: "Baking a cake", author: "abc", views: 90  },
     { _id: 4, subject: "baking", author: "xyz", views: 100 },
     { _id: 5, subject: "Café Con Leche", author: "abc", views: 200 },
     { _id: 6, subject: "Сырники", author: "jkl", views: 80 },
     { _id: 7, subject: "coffee and cream", author: "efg", views: 10 },
     { _id: 8, subject: "Cafe con Leche", author: "xyz", views: 10 }
   ]
)

Step 2:

Create a text index on whichever field you want to search. A $text query requires one.

db.articles.createIndex( { subject: "text" } )

Step 3:

db.articles.find( { $text: { $search: "coffee", $caseSensitive: true } } )   // case-sensitive search
db.articles.find( { $text: { $search: "coffee", $caseSensitive: false } } )  // case-insensitive search

Answer 8

Answered by Nilesh

db.company_profile.find({ "companyName" : { "$regex" : "Nilesh" , "$options" : "i"}});

Answer 9

Answered by Aidan Feldman

Mongo (current version 2.0.0) doesn't allow case-insensitive searches against indexed fields - see their documentation. For non-indexed fields, the regexes listed in the other answers should be fine.

Answer 10

Answered by Nick Kamer

One very important thing to keep in mind when using a regex-based query: when you are doing this for a login system, escape every single character you are searching for, and don't forget the ^ and $ anchors. Lodash has a nice function for this, should you be using it already:

// bar holds the user-supplied string; foo is the field from the question's example
db.stuff.find({ foo: { $regex: '^' + _.escapeRegExp(bar) + '$', $options: 'i' } })

Why? Imagine a user entering .* as his username. That would match all usernames, enabling a login by just guessing any user's password.
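For readers not using Lodash, the same idea can be sketched in plain JavaScript. The helper names escapeRegExp and exactInsensitive and the sample usernames below are hypothetical, introduced only for illustration; the escaping pattern is the commonly used one for regex metacharacters.

```javascript
// Hypothetical helper: escape regex metacharacters so user input is
// matched literally (similar in spirit to Lodash's _.escapeRegExp).
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: anchored, case-insensitive, literal match.
function exactInsensitive(text) {
  return new RegExp("^" + escapeRegExp(text) + "$", "i");
}

// Made-up sample data for illustration.
const usernames = ["alice", "Bob", ".*"];

// Unescaped input ".*" turns into a pattern that matches everything.
const unsafe = new RegExp("^" + ".*" + "$", "i");
const unsafeMatches = usernames.filter((u) => unsafe.test(u));

// Escaped input only matches the literal string ".*".
const safeMatches = usernames.filter((u) => exactInsensitive(".*").test(u));

console.log(unsafeMatches); // [ 'alice', 'Bob', '.*' ]
console.log(safeMatches);   // [ '.*' ]
console.log(exactInsensitive("bob").test("Bob")); // true
```

The resulting RegExp object could then be used as the value of the queried field in a find filter, as in the other answers.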
