How to push CSV data to MongoDB using Python

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/27416296/

Time: 2020-08-19 01:47:09  Source: igfitidea

How to push CSV data to MongoDB using Python

python json mongodb csv

Asked by Viswanathan

I am trying to push CSV data into MongoDB using Python. I'm a beginner with both Python and MongoDB. I used the following code:

import csv
import json
import pandas as pd
import sys, getopt, pprint
from pymongo import MongoClient
#CSV to JSON Conversion
csvfile = open('C://test//final-current.csv', 'r')
jsonfile = open('C://test//6.json', 'a')
reader = csv.DictReader( csvfile )
header= [ "S.No", "Instrument Name", "Buy Price", "Buy Quantity", "Sell Price", "Sell Quantity", "Last Traded Price", "Total Traded Quantity", "Average Traded Price", "Open Price", "High Price", "Low Price", "Close Price", "V" ,"Time"]
#fieldnames=header
output=[]
for each in reader:
    row={}
    for field in header:
        row[field]=each[field]
    output.append(row)

json.dump(output, jsonfile, indent=None, sort_keys=False , encoding="UTF-8")
mongo_client=MongoClient() 
db=mongo_client.october_mug_talk
db.segment.drop()
data=pd.read_csv('C://test//6.json', error_bad_lines=0)
df = pd.DataFrame(data)
records = csv.DictReader(df)
db.segment.insert(records)

But the documents end up in the collection in this format:

/* 0 */
{
  "_id" : ObjectId("54891c4ffb2a0303b0d43134"),
  "[{\"AverageTradedPrice\":\"0\"" : "BuyPrice:\"349.75\""
}

/* 1 */
{
  "_id" : ObjectId("54891c4ffb2a0303b0d43135"),
  "[{\"AverageTradedPrice\":\"0\"" : "BuyQuantity:\"3000\""
}

/* 2 */
{
  "_id" : ObjectId("54891c4ffb2a0303b0d43136"),
  "[{\"AverageTradedPrice\":\"0\"" : "ClosePrice:\"350\""
}

/* 3 */
{
  "_id" : ObjectId("54891c4ffb2a0303b0d43137"),
  "[{\"AverageTradedPrice\":\"0\"" : "HighPrice:\"0\""
}

What I actually want is one document per _id, with all the other columns stored as fields of that document, e.g.:

"_id" : ObjectId("54891c4ffb2a0303b0d43137")
    AveragetradedPrice :0
    HighPrice:0
    ClosePrice:350
    buyprice:350.75

Please help me out. Thanks in advance.

Accepted answer by Viswanathan

Thank you for the suggestion. This is the corrected code:

import csv
from pymongo import MongoClient

# Connect to MongoDB (default host and port) and reset the target collection
mongo_client = MongoClient()
db = mongo_client.october_mug_talk
db.segment.drop()

# Columns to copy from each CSV row; names must match the CSV header row exactly
header = [ "S No", "Instrument Name", "Buy Price", "Buy Quantity", "Sell Price", "Sell Quantity", "Last Traded Price", "Total Traded Quantity", "Average Traded Price", "Open Price", "High Price", "Low Price", "Close Price", "V" ,"Time"]

# Read the CSV and insert one document per row
with open('C://test//final-current.csv', 'r', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for each in reader:
        row = {}
        for field in header:
            row[field] = each[field]

        # insert() was removed in PyMongo 4; insert_one() stores a single document
        db.segment.insert_one(row)
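
One detail the loop above leaves open: `csv.DictReader` returns every value as a string, while the desired output in the question shows numeric values. A minimal sketch of converting numeric-looking strings before inserting, assuming plain ints and floats are what you want. `to_number` and `convert_row` are hypothetical helper names, and the sample row is invented:

```python
def to_number(value):
    """Return an int or float if the string looks numeric, else the value unchanged."""
    for cast in (int, float):
        try:
            return cast(value)
        except (TypeError, ValueError):
            continue
    return value


def convert_row(row):
    """Convert every value of one csv.DictReader row."""
    return {field: to_number(value) for field, value in row.items()}


# Invented sample row in the shape csv.DictReader produces
row = {"Instrument Name": "ABC", "Buy Price": "349.75", "Buy Quantity": "3000"}
print(convert_row(row))
```

Each converted row could then be passed to `db.segment.insert_one(...)` in place of the raw row.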

Answer by deenaik

There is a better way that needs fewer imports, assuming your CSV has a header row.

from pymongo import MongoClient
import csv

# DB connectivity
client = MongoClient('localhost', 27017)
db = client.db
collection = db.collection

# Path to the CSV file (placeholder value; set this to your own file)
FILEPATH = 'your_file.csv'

# Function to parse csv to dictionary, keyed by the 'First_value' column
def csv_to_dict():
    result = {}
    with open(FILEPATH, newline='') as csvfile:
        for row in csv.DictReader(csvfile):
            key = row.pop('First_value')
            result[key] = row
    return result

# Final insert statement
collection.insert_one(csv_to_dict())
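
To make the shape of the inserted document concrete, here is a small sketch of the same loop run on in-memory data. The sample CSV and its column names are made up for illustration, and 'First_value' stands in for the real name of your first column:

```python
import csv
import io

# Hypothetical sample data; 'First_value' is a stand-in column name
sample = io.StringIO("First_value,price,qty\nA,1.5,10\nB,2.0,20\n")

result = {}
for row in csv.DictReader(sample):
    key = row.pop('First_value')  # remove the key column from the row
    result[key] = row             # remaining columns become a nested sub-document

print(result)
```

Note that this builds one dictionary keyed by the first column, so `insert_one` stores a single document that holds all rows as nested sub-documents, rather than one document per row.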

Hope that helps.

Answer by Adil

The easiest way is to use pandas. My code is:

import json
import pymongo
import pandas as pd
myclient = pymongo.MongoClient()

df = pd.read_csv('yourcsv.csv',encoding = 'ISO-8859-1')   # loading csv file
df.to_json('yourjson.json')                               # saving to json file
jdf = open('yourjson.json').read()                        # loading the json file 
data = json.loads(jdf)                                    # reading json file 

Now you can insert this JSON into your MongoDB database :-]
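
One caveat on the snippet above: `df.to_json()` with default settings writes column-oriented JSON (`{column: {row_index: value}}`), so `data` is a single dict rather than a list of rows. A hedged sketch of turning that shape into one dict per row, which is the form `insert_many` takes. `columns_to_records` is a hypothetical helper name and the sample data is invented:

```python
import json


def columns_to_records(data):
    """Turn {column: {row_index: value}} into a list of {column: value} dicts."""
    row_ids = []
    for column_values in data.values():
        for row_id in column_values:
            if row_id not in row_ids:
                row_ids.append(row_id)
    return [
        {column: values.get(row_id) for column, values in data.items()}
        for row_id in row_ids
    ]


# Same shape as the output of df.to_json() with default settings
data = json.loads('{"name": {"0": "A", "1": "B"}, "price": {"0": 1.5, "1": 2.0}}')
records = columns_to_records(data)
print(records)
```

With a live connection, `myclient['your_db']['your_collection'].insert_many(records)` would then store one document per row; the database and collection names here are placeholders. pandas can also produce the same list directly with `df.to_dict('records')`.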

Answer by Perfect

Why insert the rows one by one? Take a look at this:

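import pandas as pd
from pymongo import MongoClient

client = MongoClient(<your_credentials>)  # placeholder: supply your own connection details
database = client['YOUR_DB_NAME']
collection = database['your_collection']

def csv_to_json(filename, header=0):
    # header=0 reads field names from the first row; header=None would yield
    # integer column labels, which cannot be used as document keys
    data = pd.read_csv(filename, header=header)
    # One dict per row
    return data.to_dict('records')

# Insert all rows in a single call
collection.insert_many(csv_to_json('your_file_path'))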

Be aware that this loads the whole file into memory, so it might crash your app when the file is too big.
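
One way to limit memory use for large files is to read and insert the rows in fixed-size batches. The following is only a sketch using the standard library: `batched` and `insert_in_batches` are hypothetical helpers, the batch size is an arbitrary choice, and a plain list stands in for the collection so the example runs without a database:

```python
import csv
import io
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def insert_in_batches(rows, insert_many, size=1000):
    """Pass rows to `insert_many` batch by batch; return the number of rows sent."""
    total = 0
    for batch in batched(rows, size):
        insert_many(batch)
        total += len(batch)
    return total


# Demo with in-memory CSV data and a stand-in for collection.insert_many
sample = io.StringIO("name,price\nA,1\nB,2\nC,3\n")
received = []
count = insert_in_batches(csv.DictReader(sample), received.append, size=2)
print(count, [len(batch) for batch in received])
```

In real use, you would pass `collection.insert_many` as the second argument and a `csv.DictReader` over the open file as the first, so only one batch of rows is held in memory at a time.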