Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/40899091/

Date: 2020-09-08 20:51:17  Source: igfitidea

Pymongo: iterate over all documents in the collection

Tags: mongodb, cursor, pymongo

Asked by mel

I am using PyMongo and trying to iterate over the documents (10 million) in my MongoDB collection, extract just a couple of keys, "name" and "address", and write them to a .csv file.

I cannot figure out the right syntax to do it with find().forEach()

I tried workarounds like:

   cursor = db.myCollection.find({"name": {$regex: REGEX}})

where REGEX would match everything, and it resulted in "Killed". I also tried:

   cursor = db.myCollection.find({"name": {"$exist": True}})

but that did not work either.

Any suggestions?

Answered by Wan Bachtiar

"I cannot figure out the right syntax to do it with find().forEach()"

cursor.forEach() is not available in Python; it is a JavaScript function. You have to get a cursor and iterate over it. See PyMongo Tutorial: querying for more than one document, where you can do:

for document in myCollection.find():
    print(document) # iterate the cursor
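
Since the goal in the question is a CSV file with only "name" and "address", the loop above can be combined with Python's csv module. The following is a minimal sketch, not a definitive implementation: export_fields and sample_documents are hypothetical names introduced here, and a plain list of dicts stands in for the cursor so the example runs without a MongoDB server.

```python
import csv
import io


def export_fields(documents, out_file, fields=("name", "address")):
    """Write the chosen fields of each document to out_file as CSV rows.

    `documents` can be any iterable of dicts, for example a PyMongo cursor.
    Rows are written one at a time, so the full result set is never held
    in memory. Returns the number of rows written.
    """
    writer = csv.DictWriter(out_file, fieldnames=list(fields), extrasaction="ignore")
    writer.writeheader()
    count = 0
    for document in documents:
        # Keys not listed in `fields` (such as "_id") are ignored;
        # missing keys become empty cells.
        writer.writerow(document)
        count += 1
    return count


# Stand-in data (an assumption for illustration, not real collection contents).
sample_documents = [
    {"_id": 1, "name": "Alice", "address": "1 Main St"},
    {"_id": 2, "name": "Bob"},  # no "address" key
]

buffer = io.StringIO()
rows_written = export_fields(sample_documents, buffer)
print(buffer.getvalue())
```

With a real collection, the same function could presumably be called as export_fields(db.myCollection.find({}, {"name": 1, "address": 1, "_id": 0}), f), where f comes from open("out.csv", "w", newline="") and db is your own database handle. The second argument to find() is a projection, so only the two fields are sent over the network.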

"where REGEX would match everything - and it resulted in "Killed"."

Unfortunately, there is not enough information here to debug what 'Killed' is and why it happened. If you would like to match everything, though, you can just state:

cursor = db.myCollection.find({"name": {"$regex": ".*"}})

This assumes that the field name contains string values. However, using $exists to check whether the field name exists would be preferable to using a regex.

However, the use of the $exists operator in your example above is incorrect: you're missing an s in $exists. Again, unfortunately, we don't have enough information about what 'did not work' means to help debug further.
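
For reference, the filters discussed above can be written as plain Python dicts. This is a small sketch under stated assumptions: the variable names are illustrative only, and nothing here contacts a server. Note that in Python the operator keys must be quoted strings.

```python
import re

# An empty filter matches every document in the collection.
match_all = {}

# Matches documents where the "name" field is present (note the trailing "s").
name_exists = {"name": {"$exists": True}}

# Matches documents whose "name" is a string matching the pattern.
name_regex = {"name": {"$regex": ".*"}}

# PyMongo also accepts a compiled Python pattern as the field's value.
name_compiled = {"name": re.compile(".*")}

for label, query in [("all", match_all), ("exists", name_exists),
                     ("regex", name_regex), ("compiled", name_compiled)]:
    print(label, query)
```

Any of these dicts can be passed as the first argument to find(), for example db.myCollection.find(name_exists). If the aim is simply every document, the empty filter (or no filter at all) avoids evaluating a regex per document.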

If you're writing this script as a Python exercise, I would recommend reviewing:

You could also enrol in a free online course at MongoDB University for M101P: MongoDB for Python Developers.

However, if you are just trying to export CSV from a collection, you could use MongoDB's mongoexport as an alternative, which supports CSV output.

See mongoexport usage for more information.

Answered by GodIsAnAstronaut

I had no luck with .find().forEach() either, but this should find what you are searching for and then print it.

First, find all documents that match what you are searching for:

cursors = db.myCollection.find({"name": {"$regex": REGEX}})

Then iterate over the matches:

# each item yielded by the cursor is one document (a dict)
for document in cursors:
    print(document.get("name"))

Answered by Confidenc3

The find() method returns a PyMongo cursor, which is a reference to the result set of a query.

You have to dereference that reference (address) somehow.

After that, you will get a better understanding of how to manipulate/manage the cursor.

Try the following for a start:

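# collection_name is a placeholder for your own collection
result = db.collection_name.find()
# list() pulls every document into memory, so use it only on small result sets
print(list(result))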
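
On a collection of 10 million documents, list(result) would try to hold everything in memory at once. To look at just the first few documents, one option is to consume the cursor lazily. The sketch below assumes a hypothetical generator, fake_cursor, standing in for a PyMongo cursor so that it runs without a server; itertools.islice works the same way on any iterator.

```python
from itertools import islice


def fake_cursor(n):
    """Hypothetical stand-in for a PyMongo cursor: yields documents lazily."""
    for i in range(n):
        yield {"_id": i, "name": f"name-{i}"}


# Only three documents are ever produced, even though the source could
# yield ten million; the rest are never materialised.
first_three = list(islice(fake_cursor(10_000_000), 3))
print(first_three)
```

With PyMongo itself, calling limit(3) on the cursor is a server-side way to get the same effect.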