How to convert a CSV file to multiline JSON in Python?

Question

Asked by BeanBagKing

Here's my code, really simple stuff...

import csv
import json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("FirstName","LastName","IDNumber","Message")
reader = csv.DictReader( csvfile, fieldnames)
out = json.dumps( [ row for row in reader ] )
jsonfile.write(out)

I declare some field names; the reader uses csv to read the file, and the field names are used to dump the file in JSON format. Here's the problem...

Each record in the CSV file is on a different row. I want the JSON output to be the same way. The problem is it dumps it all on one giant, long line.

I've tried using something like for line in csvfile: and then running my code below that with reader = csv.DictReader(line, fieldnames), which loops through each line, but it writes the entire file on one line, then loops through the entire file again on another line, and continues until it runs out of lines.

Any suggestions for correcting this?

Edit: To clarify, currently I have: (every record on line 1)

[{"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"},{"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}]

What I'm looking for: (2 records on 2 lines)

{"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"}
{"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}

Not each individual field indented or on a separate line, but each record on its own line.

Some sample input.

"John","Doe","001","Message1"
"George","Washington","002","Message2"

Answer 1

Accepted answer by SingleNegationElimination

The problem with your desired output is that it is not a valid JSON document; it's a stream of JSON documents!

That's okay, if it's what you need, but it means that for each document you want in your output, you'll have to call json.dumps.

Since the newline you want separating your documents is not contained in those documents, you're on the hook for supplying it yourself. So we just need to pull the loop out of the call to json.dump and interpose newlines for each document written.

import csv
import json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("FirstName","LastName","IDNumber","Message")
reader = csv.DictReader( csvfile, fieldnames)
for row in reader:
    json.dump(row, jsonfile)
    jsonfile.write('\n')
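
The same loop can also be wrapped in a small function that works on any pair of open streams. This is only a minimal sketch for Python 3, not part of the original answer: the helper name csv_to_json_lines is hypothetical, and in-memory streams stand in for file.csv and file.json so the example runs on its own.

```python
import csv
import io
import json

FIELDNAMES = ("FirstName", "LastName", "IDNumber", "Message")


def csv_to_json_lines(csv_stream, json_stream, fieldnames=FIELDNAMES):
    """Write one JSON document per CSV record, each on its own line."""
    reader = csv.DictReader(csv_stream, fieldnames)
    count = 0
    for row in reader:
        json.dump(row, json_stream)
        json_stream.write('\n')  # the separator is not part of the document
        count += 1
    return count


# In-memory stand-ins for the real files (sample input from the question).
sample = io.StringIO(
    '"John","Doe","001","Message1"\n'
    '"George","Washington","002","Message2"\n'
)
result = io.StringIO()
written = csv_to_json_lines(sample, result)
lines = result.getvalue().splitlines()
print(result.getvalue(), end='')
```

With real files, the call would presumably look like with open('file.csv', newline='') as src, open('file.json', 'w') as dst: csv_to_json_lines(src, dst), which also closes both files when the block ends.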

Answer 2

Answer by Wayne Werner

Add the indent parameter to json.dumps:

 data = {'this': ['has', 'some', 'things'],
         'in': {'it': 'with', 'some': 'more'}}
 print(json.dumps(data, indent=4))

Also note that you can simply use json.dump with the open jsonfile:

json.dump(data, jsonfile)
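
Note that indent pretty-prints every field on its own line, which is not quite the one-record-per-line layout the question asks for. A small sketch comparing the two, using the two records from the question (the variable names are only illustrative):

```python
import json

records = [
    {"FirstName": "John", "LastName": "Doe", "IDNumber": "123", "Message": "None"},
    {"FirstName": "George", "LastName": "Washington", "IDNumber": "001", "Message": "Something"},
]

# indent=4 spreads a single JSON array over many lines, one field per line.
pretty = json.dumps(records, indent=4)

# One json.dumps call per record yields one record per line instead.
per_record = "\n".join(json.dumps(record) for record in records)

print(pretty)
print(per_record)
```

The first form is still a single valid JSON document; the second is a stream of separate documents, one per line.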

Answer 3

Answer by MONTYHS

import csv
import json
csvfile = csv.DictReader(open('filename.csv', 'r'))
output =[]
for each in csvfile:
    row ={}
    row['FirstName'] = each['FirstName']
    row['LastName']  = each['LastName']
    row['IDNumber']  = each ['IDNumber']
    row['Message']   = each['Message']
    output.append(row)
json.dump(output,open('filename.json','w'),indent=4,sort_keys=False)

Answer 4

Answer by GarciadelCastillo

As a slight improvement to @MONTYHS's answer, iterating through a tuple of field names:

import csv
import json

csvfilename = 'filename.csv'
jsonfilename = csvfilename.split('.')[0] + '.json'
csvfile = open(csvfilename, 'r')
jsonfile = open(jsonfilename, 'w')
reader = csv.DictReader(csvfile)

fieldnames = ('FirstName', 'LastName', 'IDNumber', 'Message')

output = []

for each in reader:
  row = {}
  for field in fieldnames:
    row[field] = each[field]
  output.append(row)  # inside the loop, so every row is kept

json.dump(output, jsonfile, indent=2, sort_keys=True)

Answer 5

Answer by Snork S

You can try this

import csvmapper

# how does the object look
mapper = csvmapper.DictMapper([ 
  [ 
     { 'name' : 'FirstName'},
     { 'name' : 'LastName' },
     { 'name' : 'IDNumber', 'type':'int' },
     { 'name' : 'Messages' }
  ]
 ])

# parser instance
parser = csvmapper.CSVParser('sample.csv', mapper)
# conversion service
converter = csvmapper.JSONConverter(parser)

print(converter.doConvert(pretty=True))

Edit:

Simpler approach

import csvmapper

fields = ('FirstName', 'LastName', 'IDNumber', 'Messages')
parser = csvmapper.CSVParser('sample.csv', csvmapper.FieldMapper(fields))

converter = csvmapper.JSONConverter(parser)

print(converter.doConvert(pretty=True))

Answer 6

Answer by Lawrence I. Siden

I took @SingleNegationElimination's response and simplified it into a three-liner that can be used in a pipeline:

import csv
import json
import sys

for row in csv.DictReader(sys.stdin):
    json.dump(row, sys.stdout)
    sys.stdout.write('\n')

Answer 7

Answer by impiyush

How about using Pandas to read the CSV file into a DataFrame (pd.read_csv), then manipulating the columns if you want (dropping them or updating values), and finally converting the DataFrame back to JSON (pd.DataFrame.to_json)?

Note: I haven't checked how efficient this will be, but this is definitely one of the easiest ways to manipulate and convert a large CSV to JSON.

Answer 8

Answer by Mark Channing

I see this is old, but I needed the code from SingleNegationElimination and ran into an issue with data containing non-UTF-8 characters. These appeared in fields I was not overly concerned with, so I chose to ignore them. That took some effort, however. I am new to Python, so it took some trial and error to get it to work. The code is a copy of SingleNegationElimination's with extra handling of UTF-8. I tried to do it with https://docs.python.org/2.7/library/csv.html but in the end gave up. The code below worked.

import csv, json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("Scope","Comment","OOS Code","In RMF","Code","Status","Name","Sub Code","CAT","LOB","Description","Owner","Manager","Platform Owner")
reader = csv.DictReader(csvfile , fieldnames)

code = ''
for row in reader:
    try:
        print('+' + row['Code'])
        for key in row:
            # Python 2 only: drop bytes that are not valid UTF-8
            row[key] = row[key].decode('utf-8', 'ignore').encode('utf-8')
        json.dump(row, jsonfile)
        jsonfile.write('\n')
    except:
        print('-' + row['Code'])
        raise
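
The decode/encode round trip above relies on Python 2 byte strings. On Python 3, one possible equivalent, assuming the goal is still to silently drop bytes that are not valid UTF-8, is to pass errors='ignore' when opening the CSV file. This is only a sketch: the two field names are hypothetical placeholders, and a temporary file stands in for file.csv.

```python
import csv
import json
import os
import tempfile

fieldnames = ("Code", "Name")  # hypothetical; substitute the real column names

# Build a small CSV containing one byte (0xff) that is not valid UTF-8.
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(b"A1,caf\xff\nB2,plain\n")
    csv_path = tmp.name

json_lines = []
# errors='ignore' drops undecodable bytes instead of raising UnicodeDecodeError.
with open(csv_path, "r", encoding="utf-8", errors="ignore", newline="") as csvfile:
    for row in csv.DictReader(csvfile, fieldnames):
        json_lines.append(json.dumps(row))

os.remove(csv_path)
print("\n".join(json_lines))
```

If the damaged characters should stay visible rather than vanish, errors='replace' is an alternative that substitutes a replacement character.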

Answer 9

Answer by Naufal

You can use a Pandas DataFrame to achieve this, as in the following example:

import pandas as pd
csv_file = pd.DataFrame(pd.read_csv("path/to/file.csv", sep = ",", header = 0, index_col = False))
csv_file.to_json("/path/to/new/file.json", orient = "records", date_format = "epoch", double_precision = 10, force_ascii = True, date_unit = "ms", default_handler = None)

Answer 10

Answer by Laxman

import csv
import json

file = 'csv_file_name.csv'
json_file = 'output_file_name.json'

#Read CSV File
def read_CSV(file, json_file):
    csv_rows = []
    with open(file) as csvfile:
        reader = csv.DictReader(csvfile)
        field = reader.fieldnames
        for row in reader:
            csv_rows.extend([{field[i]:row[field[i]] for i in range(len(field))}])
        convert_write_json(csv_rows, json_file)

#Convert csv data into json
def convert_write_json(data, json_file):
    with open(json_file, "w") as f:
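        # Pretty-printed output; for compact output use f.write(json.dumps(data)) instead.
        # Writing both forms to the same file would produce invalid JSON.
        f.write(json.dumps(data, sort_keys=False, indent=4, separators=(',', ': ')))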


read_CSV(file,json_file)

Documentation of json.dumps()
