Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/27527982/

Time: 2020-08-19 01:55:52  Source: igfitidea

Read BSON file in Python?

python mongodb bson

Asked by Richard

I want to read a BSON-format Mongo dump in Python and process the data. I am using the Python bson package (which I'd prefer to use rather than take on a pymongo dependency), but its documentation doesn't explain how to read from a file.

This is what I'm trying:

bson_file = open('statistics.bson', 'rb')
b = bson.loads(bson_file)
print b[0]

But I get:

Traceback (most recent call last):
  File "test.py", line 11, in <module>
    b = bson.loads(bson_file)
  File "/Library/Python/2.7/site-packages/bson/__init__.py", line 75, in loads
    return decode_document(data, 0)[1]
  File "/Library/Python/2.7/site-packages/bson/codec.py", line 235, in decode_document
    length = struct.unpack("<i", data[base:base + 4])[0]
TypeError: 'file' object has no attribute '__getitem__'

What am I doing wrong?

Accepted answer by njzk2

The documentation states:

> help(bson.loads)
Given a BSON string, outputs a dict.

You need to pass a string. For example:

> b = bson.loads(bson_file.read())
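
One caveat worth checking: a mongodump .bson file is a sequence of BSON documents written back to back, and the traceback in the question suggests loads decodes a single document starting at offset 0. If that holds for your version of the package, the call above would give you only the first document. The sketch below is a minimal, stdlib-only way to split the file contents into individual documents using the 4-byte little-endian length prefix that begins every BSON document. The name iter_bson_documents is made up for illustration and is not part of any library.

```python
import struct


def iter_bson_documents(data):
    """Yield the raw bytes of each BSON document in a concatenated dump.

    Hypothetical helper for illustration; it only splits the input and
    does not decode field values.
    """
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError("truncated length prefix at offset %d" % offset)
        # Each document begins with its total size as a little-endian int32.
        (length,) = struct.unpack_from("<i", data, offset)
        # The smallest valid document is 5 bytes (length prefix plus trailing NUL).
        if length < 5 or offset + length > len(data):
            raise ValueError("invalid document length at offset %d" % offset)
        yield data[offset:offset + length]
        offset += length


# Intended use with the package from the question (not executed here):
#
#     with open('statistics.bson', 'rb') as bson_file:
#         for doc_bytes in iter_bson_documents(bson_file.read()):
#             print(bson.loads(doc_bytes))
```

Each yielded chunk is a complete document, so it can be passed to bson.loads on its own. For a large dump you would probably want to read the file incrementally instead of calling read() once.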

Answer by Wander Nauta

loads expects a string (that's what the 's' stands for), not a file. Try reading from the file and passing the result to loads.

Answer by Marc Maxmeister

I found this worked for me with a MongoDB 2.4 BSON file and Python's 'bson' module:

import bson
with open('survey.bson','rb') as f:
    data = bson.decode_all(f.read())

That returned a list of dictionaries matching the JSON documents stored in that mongo collection.

The raw data returned by f.read() looks like this for a BSON file:

>>> rawdata[:100]
'\x04\x01\x00\x00\x12_id\x00\x01\x00\x00\x00\x00\x00\x00\x00\x02_type\x00\x07\x00\x00\x00simple\x00\tchanged\x00\xd0\xbb\xb2\x9eI\x01\x00\x00\tcreated\x00\xd0L\xdcfI\x01\x00\x00\x02description\x00\x14\x00\x00\x00testing the bu'
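
To make those bytes less opaque, here is a small sketch that decodes the start of that output by hand with the standard struct module. It assumes Python 3 and the layout described in the BSON specification (int32 document length, then a type byte, a NUL-terminated key, and the value). The rawdata literal is just a prefix copied from the output above, not a complete document.

```python
import struct

# Prefix of the dump shown above (not a complete document).
rawdata = (b"\x04\x01\x00\x00\x12_id\x00\x01\x00\x00\x00\x00\x00\x00\x00"
           b"\x02_type\x00\x07\x00\x00\x00simple\x00")

# Bytes 0-3: total document length as a little-endian int32.
(doc_length,) = struct.unpack_from("<i", rawdata, 0)

# Byte 4: element type; 0x12 denotes a 64-bit integer in the BSON spec.
element_type = rawdata[4]

# The key is a NUL-terminated string that follows the type byte.
key_end = rawdata.index(b"\x00", 5)
key = rawdata[5:key_end].decode("utf-8")

# An int64 value occupies the next 8 bytes, little-endian.
(value,) = struct.unpack_from("<q", rawdata, key_end + 1)

print(doc_length, hex(element_type), key, value)  # 260 0x12 _id 1
```

So the first document is 260 bytes long and begins with an _id field holding the integer 1. The following \x02 byte marks a string element named _type.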