Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/5509911/
How do I create a "like" filter view in CouchDB?
Asked by yuda
Here's an example of what I need in SQL:
SELECT name FROM employ WHERE name LIKE '%bro%'
How do I create a view like that in CouchDB?
Accepted answer by natevw
The simple answer is that CouchDB views aren't ideal for this.
The more complicated answer is that this type of query tends to be very inefficient in typical SQL engines too, so if you grant that there will be tradeoffs with any solution, then CouchDB actually has the benefit of letting you choose your tradeoff.
1. The SQL way
When you do SELECT ... WHERE name LIKE '%bro%', all the SQL engines I'm familiar with must do what's called a "full table scan". This means the server reads every row in the relevant table and brute-force scans the field to see if it matches.
You can do this in CouchDB 2.x with a Mango query using the $regex operator. The query would look something like this for the basic case:
{
  "selector": {
    "name": {
      "$regex": "bro"
    }
  }
}
There do not appear to be any options exposed for case sensitivity, etc., but you could extend the pattern to match only at the beginning or end of the field, or to express more complicated patterns. If you can also restrict your query via some other (indexable) field operator, that would likely help performance. As the documentation warns:
Regular expressions do not work with indexes, so they should not be used to filter large data sets. […]
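As a rough sketch of combining the regex with a second, indexable field, the snippet below builds such a selector in JavaScript. The helper names and the type field are assumptions made for illustration, not part of CouchDB; the resulting object is the kind of body you would POST to the database's _find endpoint.

```javascript
// Escape characters that have special meaning in a regular expression,
// so a user-supplied term is matched literally.
function escapeRegex(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a Mango selector for "field contains term".
// `extra` holds optional equality conditions on other (ideally indexed) fields;
// fields listed side by side in a selector are combined with an implicit AND.
function containsSelector(field, term, extra = {}) {
  return {
    selector: { ...extra, [field]: { $regex: escapeRegex(term) } },
  };
}

// The "type" field is a hypothetical document attribute.
const query = containsSelector('name', 'bro', { type: 'employee' });
console.log(JSON.stringify(query));
// → {"selector":{"type":"employee","name":{"$regex":"bro"}}}
```

Escaping the term matters as soon as it comes from user input, since a character such as "." or "(" would otherwise be interpreted as part of the pattern.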
You can do a full scan in CouchDB 1.x too, using a temporary view:
POST /some_database/_temp_view
{"map": "function (doc) { if (doc.name && doc.name.indexOf('bro') !== -1) emit(null); }"}
This will look through every single document in the database and give you a list of matching documents. You can tweak the map function to also match on a document type, to emit with a certain key for ordering (emit(doc.timestamp)), or to emit some data value useful to your purpose (emit(null, doc.name)).
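The tweaks described above could be combined roughly as follows. This is only a sketch: the type field and its 'employee' value are assumed for illustration, and the emit stand-in exists so the map function can be tried outside CouchDB.

```javascript
// Map function as it might be sent to the server (CouchDB supplies `emit`).
// The `type` field and the 'employee' value are assumptions for illustration.
function map(doc) {
  if (doc.type === 'employee' && doc.name && doc.name.indexOf('bro') !== -1) {
    // The key orders the results, the value carries the matched name.
    emit(doc.timestamp, doc.name);
  }
}

// Minimal stand-in for CouchDB's emit, used only for this local trial.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

[
  { type: 'employee', name: 'dobros', timestamp: 2 },
  { type: 'employee', name: 'alice', timestamp: 1 },
  { type: 'invoice', name: 'brochure', timestamp: 3 },
].forEach(map);

console.log(rows); // one row: key 2, value 'dobros'
```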
2. The "tons of disk space available" way
Depending on your source data size, you could create an index that emits every possible "interior string" as its permanent (on-disk) view key. That is to say, for a name like "Dobros" you would emit("dobros"); emit("obros"); emit("bros"); emit("ros"); emit("os"); emit("s");. Then for a term like '%bro%' you could query your view with startkey="bro"&endkey="bro\uFFFF" to get all occurrences of the lookup term. Your index will be approximately the size of your text content squared, but if you need to do an arbitrary "find in string" faster than the full DB scan above and have the space, this might work. You'd be better served by a data structure designed for substring searching, though.
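A minimal sketch of that suffix-emitting view, with a tiny in-memory stand-in for the view index so the range query can be tried locally. The sample documents and the queryRange helper are made up for illustration, and plain JavaScript string comparison only approximates CouchDB's real key collation.

```javascript
// Map function: emit every suffix of the lowercased name as a view key.
function mapSuffixes(doc) {
  if (typeof doc.name !== 'string') return;
  var name = doc.name.toLowerCase();
  for (var i = 0; i < name.length; i++) {
    emit(name.slice(i), null);
  }
}

// In-memory stand-in for the view index (assumption, for local trial only).
const index = [];
let currentId = null;
function emit(key, value) {
  index.push({ key: key, id: currentId });
}

const docs = [
  { _id: '1', name: 'Dobros' },
  { _id: '2', name: 'Ambrose' },
  { _id: '3', name: 'Alice' },
];
docs.forEach(function (doc) {
  currentId = doc._id;
  mapSuffixes(doc);
});

// Rough equivalent of ?startkey="bro"&endkey="bro\uFFFF".
function queryRange(startkey, endkey) {
  return index
    .filter(function (row) { return row.key >= startkey && row.key <= endkey; })
    .map(function (row) { return row.id; });
}

const hits = queryRange('bro', 'bro\uFFFF');
console.log(hits); // → [ '1', '2' ]
```

A name that contains the term more than once would show up once per occurrence, so the caller may want to de-duplicate document ids.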
Which brings us to...
3. The Full Text Search way
You could use a CouchDB plugin (couchdb-lucene, now via Dreyfus/Clouseau for 2.x; ElasticSearch; SQLite's FTS) to generate an auxiliary text-oriented index into your documents.
Note that most full text search indexes don't naturally support arbitrary wildcard prefixes either, likely for similar reasons of space efficiency as we saw above. Usually full text search doesn't imply "brute force binary search", but "word search". YMMV though; take a look around at the options available in your full text engine.
If you don't really need to find "bro" anywhere in a field, you can implement basic "find a word starting with X" search with regular CouchDB views by just splitting on various locale-specific word separators and emitting these "words" as your view keys. This will be more efficient than above, scaling proportionally to the amount of data indexed.
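The word-splitting variant might look something like this. It is a sketch under simplifying assumptions: the separator pattern only handles ASCII letters and digits (real data may need locale-aware splitting), and the filter at the end merely imitates a startkey/endkey range query.

```javascript
// Map function: emit each lowercased "word" of the name as a view key.
function mapWords(doc) {
  if (typeof doc.name !== 'string') return;
  // Split on anything that is not an ASCII letter or digit (simplification).
  var words = doc.name.toLowerCase().split(/[^a-z0-9]+/);
  for (var i = 0; i < words.length; i++) {
    if (words[i]) emit(words[i], null);
  }
}

// Stand-in for CouchDB's emit, for local trial only.
const keys = [];
function emit(key, value) {
  keys.push(key);
}

mapWords({ name: 'Big Brother-Brown' });

// Imitates ?startkey="bro"&endkey="bro\uFFFF" on the emitted keys.
const matches = keys.filter(function (k) {
  return k >= 'bro' && k <= 'bro\uFFFF';
});
console.log(matches); // → [ 'brother', 'brown' ]
```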
Answer by Dominic Barnes
Unfortunately, searches using LIKE %...% aren't really how CouchDB views work, but you can get a great deal of search capability by installing couchdb-lucene. It's a full-text search engine that creates indexes on your database, which you can use for more sophisticated searches.
The typical way to "search" a database for a given key, without any third-party tools, is to create a view that emits the value you are looking for as the key. In your example:
function (doc) {
emit(doc.name, doc);
}
This outputs a list of all the names in your database.
Now, you would "search" based on the first letters of your key. For example, if you are searching for names that start with "bro":
/db/_design/test/_view/names?startkey="bro"&endkey="brp"
Notice that I took the search parameter and "incremented" its last letter. Again, if you want to perform searches rather than aggregate statistics, you should use a full-text search engine like Lucene (see above).
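The "increment the last letter" step can be automated along these lines. The helper names are hypothetical. Two caveats apply: endkey is inclusive by default, so the sketch adds inclusive_end=false to keep a name that is exactly "brp" out of the results; and incrementing a character code only approximates CouchDB's collation order, so prefixes ending in letters such as "z" may need the "\uFFFF"-style end key from the accepted answer instead.

```javascript
// Return the prefix with its last character replaced by the next code unit,
// e.g. "bro" -> "brp". Assumes a non-empty prefix.
function incrementLastChar(prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  return prefix.slice(0, -1) + String.fromCharCode(last + 1);
}

// Build the query string for a "starts with" lookup against a view.
// View keys are JSON values, so each key is JSON-encoded and then URL-encoded.
function prefixQueryUrl(viewPath, prefix) {
  if (!prefix) throw new Error('prefix must be a non-empty string');
  const startkey = encodeURIComponent(JSON.stringify(prefix));
  const endkey = encodeURIComponent(JSON.stringify(incrementLastChar(prefix)));
  return viewPath + '?startkey=' + startkey + '&endkey=' + endkey + '&inclusive_end=false';
}

console.log(prefixQueryUrl('/db/_design/test/_view/names', 'bro'));
// → /db/_design/test/_view/names?startkey=%22bro%22&endkey=%22brp%22&inclusive_end=false
```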
Answer by yuda
I found a simple view code for my problem:
{
  "getavailableproduct": {
    "map": "function(doc) {
      /* Skip documents without a string productid instead of throwing. */
      if (typeof doc.productid !== 'string') return;
      var prefix = doc.productid.match(/[A-Za-z0-9]+/g);
      if (prefix) {
        /* Emit each alphanumeric word as its own key. */
        for (var i = 0; i < prefix.length; i++) { emit(prefix[i], null); }
      }
    }"
  }
}
With this view, each key sentence is split into key words, and I can call
?key="[search_keyword]"
But I need more complex code, because with this view I can only find whole words that I type (e.g. eat, food).
If I type an incomplete word (e.g. ea from eat, or foo from food), it does not work.
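One possible way around that limitation, sketched here as an assumption and not as the poster's eventual solution, is to emit every leading fragment of each word so that an exact ?key= lookup also matches partial words. The minimum fragment length of 2 is arbitrary, and the index grows with word length.

```javascript
// Map function: for each alphanumeric word in productid, emit all of its
// prefixes of length >= 2, so ?key="ea" finds documents containing "eat".
function mapPrefixes(doc) {
  if (typeof doc.productid !== 'string') return;
  var words = doc.productid.match(/[A-Za-z0-9]+/g) || [];
  for (var i = 0; i < words.length; i++) {
    var word = words[i].toLowerCase();
    for (var len = 2; len <= word.length; len++) {
      emit(word.slice(0, len), null);
    }
  }
}

// Stand-in for CouchDB's emit, for local trial only.
const keys = [];
function emit(key, value) {
  keys.push(key);
}

mapPrefixes({ productid: 'eat food' });
console.log(keys); // → [ 'ea', 'eat', 'fo', 'foo', 'food' ]
```

The alternative that needs no extra index entries is to keep the original view and query it with a startkey/endkey range, as the other answers describe.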
Answer by Victor Bermudez
I know it is an old question, but what about using a "list" function? You can keep all your normal views, and then add a "list" function to the design document to process the view's results:
{
"_id": "_design/...",
"views": {
"employees": "..."
},
"lists": {
"by_name": "..."
}
}
And the function attached to "by_name" should be something like:
function (head, req) {
  provides('json', function () {
    var filtered = [];
    var row;
    while ((row = getRow())) {
      // We can retrieve all row information from the view
      var key = row.key;
      var value = row.value;
      // Keep rows whose name starts with the requested string
      if (value && typeof value.name === 'string' && value.name.indexOf(req.query.name) === 0) {
        if (req.query.include_docs) {
          filtered.push({ key: key, value: value, doc: row.doc });
        } else {
          filtered.push({ key: key, value: value });
        }
      }
    }
    return toJSON({ total_rows: filtered.length, rows: filtered });
  });
}
You can, of course, use regular expressions too. It's not a perfect solution, but it works for me.
Answer by amustapha
You can use regular expressions. As per this table, you can write something like this to return any id that contains "sms":
{
  "selector": {
    "_id": {
      "$regex": "sms"
    }
  }
}
Basic regexes you can use include:
"^sms" roughly corresponds to LIKE "sms%"
"sms$" roughly corresponds to LIKE "%sms"
You can read more on regular expressions here
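To make the correspondence concrete, here is a small hypothetical helper that turns a SQL LIKE pattern into an anchored regular expression string suitable for $regex. It is a sketch only: it handles the % and _ wildcards, ignores LIKE escape clauses, and is checked here with JavaScript's RegExp, whose syntax agrees with CouchDB's regex engine for patterns this simple.

```javascript
// Convert a SQL LIKE pattern to an anchored regex source string.
// % -> .*   _ -> .   everything else is matched literally.
function likeToRegex(pattern) {
  let regex = '';
  for (const ch of pattern) {
    if (ch === '%') regex += '.*';
    else if (ch === '_') regex += '.';
    else regex += ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  }
  return '^' + regex + '$';
}

console.log(likeToRegex('sms%'));  // → ^sms.*$
console.log(likeToRegex('%sms'));  // → ^.*sms$
console.log(likeToRegex('%sms%')); // → ^.*sms.*$

// The result can be dropped into a Mango selector.
const selector = { selector: { _id: { $regex: likeToRegex('%sms%') } } };
```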
Answer by fet
You could emit your documents like normal: emit(doc.name, null);. I would throw a toLowerCase() on that name to remove case sensitivity.
Then query the view with a slew of keys to see if something "like" the query shows up.
keys = differentVersions("bro"); // returns ["bro", "br", "bo", "ro", "cro", "dro", ..., "zro"]
$.couch.db("db").view("employeesByName", { keys: keys, success: dealWithIt } )
Some considerations
Obviously that array can get really big really fast depending on what differentVersions returns. You might hit a post data limit at some point or conceivably get slow lookups.
The results are only as good as differentVersions is at giving you guesses for what the person meant to spell. Obviously this function can be as simple or complex as you like. In this example I tried two strategies: a) removed a letter and pushed that, and b) replaced the letter at position n with all other letters. So if someone had been looking for "bro" but typed in "gro" or "bri" or even "bgro", differentVersions would have permuted that to "bro" at some point.
While not ideal, it's still pretty fast since a lookup in Couch's B-trees is fast.
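The answer does not show differentVersions itself, so the following is only a guess at how the two strategies it describes (dropping a letter, and substituting each position with every letter) might be written. The lowercase a-z alphabet is an assumption.

```javascript
// Generate spelling variants of a term: the term itself, every single-letter
// deletion, and every single-letter substitution from a-z. Duplicates are removed.
function differentVersions(term) {
  const alphabet = 'abcdefghijklmnopqrstuvwxyz';
  const variants = new Set([term]);
  for (let i = 0; i < term.length; i++) {
    // a) drop the letter at position i
    variants.add(term.slice(0, i) + term.slice(i + 1));
    // b) replace the letter at position i with each letter of the alphabet
    for (const letter of alphabet) {
      variants.add(term.slice(0, i) + letter + term.slice(i + 1));
    }
  }
  variants.delete(''); // a one-letter term would otherwise yield an empty key
  return Array.from(variants);
}

const keys = differentVersions('bro');
console.log(keys.length); // → 79
console.log(differentVersions('bgro').includes('bro')); // → true
```

Even for a three-letter term this produces 79 keys, which illustrates the first consideration above: the key list grows quickly with term length.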
Answer by Imad
Why can't we just use indexOf() in the view?