使用 mongoimport 将文件中的 json 导入 mongodb

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/30380751/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-08 20:27:02  来源:igfitidea点击:

Importing json from file into mongodb using mongoimport

mongodb

提问by Cybernetic

I have my json_file.json like this:

我有这样的 json_file.json:

[
{
    "project": "project_1",
    "coord1": 2,
    "coord2": 10,
    "status": "yes",
    "priority": 7
},
{
    "project": "project_2",
    "coord1": 2,
    "coord2": 10,
    "status": "yes",
    "priority": 7
},
{
    "project": "project_3",
    "coord1": 2,
    "coord2": 10,
    "status": "yes",
    "priority": 7
}
]

When I run the following command to import this into mongodb:

当我运行以下命令将其导入 mongodb 时:

mongoimport --db my_db --collection my_collection --file json_file.json 

I get the following error:

我收到以下错误:

Failed: error unmarshaling bytes on document #0: JSON decoder out of sync - data changing underfoot?

If I add the --jsonArray flag to the command I import like this:

如果我将 --jsonArray 标志添加到我像这样导入的命令中:

imported 3 documents

instead of one document with the json format as shown in the original file.

而不是原始文件中显示的 json 格式的一个文档。

How can I import json into mongodb with the original format in the file shown above?

如何将上面显示的文件中的原始格式的json导入到mongodb中?

回答by Alex Glukhovtsev

The mongoimporttool has an option:

mongoimport工具有一个选项:

--jsonArraytreat input source as a JSON array

--jsonArray将输入源视为 JSON 数组

Or it is possible to import from file containing same data format as the result of db.collection.find()command. Here is example from university.mongodb.comcourseware some content from grades.json:

或者可以从包含与db.collection.find()命令结果相同的数据格式的文件中导入。以下是来自university.mongodb.com课件的示例,部分内容来自grades.json

{ "_id" : { "$oid" : "50906d7fa3c412bb040eb577" }, "student_id" : 0, "type" : "exam", "score" : 54.6535436362647 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb578" }, "student_id" : 0, "type" : "quiz", "score" : 31.95004496742112 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb579" }, "student_id" : 0,       "type" : "homework", "score" : 14.8504576811645 }

As you can see, no array used and no comma delimiters between documents either.

如您所见,文档之间没有使用数组,也没有逗号分隔符。

I discover, recently, that this complies with the JSON Lines textformat.

我最近发现,这符合the JSON Lines text格式。

Like one used in apache.spark.sql.DataFrameReader.json()method.

就像apache.spark.sql.DataFrameReader.json()方法中使用的一样。

回答by ALAN WARD

Perhaps the following reference from the MongoDB project blog could help you gain insight on how arrays work in Mongo:

也许来自 MongoDB 项目博客的以下参考可以帮助您深入了解数组在 Mongo 中的工作方式:

https://blog.mlab.com/2013/04/thinking-about-arrays-in-mongodb/

https://blog.mlab.com/2013/04/thinking-about-arrays-in-mongodb/

I would frame your import otherwise, and either:

否则,我会框定您的导入,或者:

a) import the three different objects separately into the collection as you say, using the --jsonArray flag; or

a) 如您所说,使用 --jsonArray 标志将三个不同的对象分别导入到集合中;或者

b) encapsulate the complete array within a single object, for example in this way:

b) 将完整数组封装在单个对象中,例如以这种方式:

{
"mydata": 
    [
    {
          "project": "project_1",
          ...
          "priority": 7
    }
    ]
}

HTH.

哈。

回答by Tomas

I faced opposite problem today, my conclusion would be:

我今天遇到了相反的问题,我的结论是:

If you wish to insert array of JSON objects at once, where each array entry shall be treated as separate dtabase entry, you have two options of syntax:

如果您希望一次插入 JSON 对象数组,其中每个数组条目应被视为单独的 dtabase 条目,您有两种语法选项:

  1. Array of object with valid coma positions & --jsonArray flag obligatory

    [
      {obj1},
      {obj2},
      {obj3}
    ]
    
  2. Use file with basically incorrect JSON formatting (i.e. missing ,between JSON object instances & without --jsonArray flag

    {obj1}
    {obj2}
    {obj3}
    
  1. 具有有效昏迷位置和 --jsonArray 标志的对象数组

    [
      {obj1},
      {obj2},
      {obj3}
    ]
    
  2. 使用基本上不正确的 JSON 格式的文件(即,在 JSON 对象实例之间丢失且没有 --jsonArray 标志

    {obj1}
    {obj2}
    {obj3}
    

If you wish to insert only an array (i.e. array as top-level citizen of your database) I think it's not possible and not valid, because mongoDB by definition supports documents as top-level objects which are mapped to JSON objects afterwards. In other words, you must wrap your array into JSON object as ALAN WARD pointed out.

如果您只想插入一个数组(即数组作为数据库的顶级公民),我认为这是不可能且无效的,因为 mongoDB 根据定义支持将文档作为顶级对象,然后映射到 JSON 对象。换句话说,您必须像 ALAN WARD 指出的那样将数组包装到 JSON 对象中。