Python: Delete blank rows from CSV?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/4521426/

Time: 2020-08-18 16:06:21  Source: igfitidea

Delete blank rows from CSV?

python csv delete-row

Asked by debugged

I have a large csv file in which some rows are entirely blank. How do I use Python to delete all blank rows from the csv?


After all your suggestions, this is what I have so far


import csv

# open input csv for reading
inputCSV = open(r'C:\input.csv', 'rb')

# create output csv for writing
outputCSV = open(r'C:\OUTPUT.csv', 'wb')

# prepare output csv for appending
appendCSV = open(r'C:\OUTPUT.csv', 'ab')

# create reader object
cr = csv.reader(inputCSV, dialect = 'excel')

# create writer object
cw = csv.writer(outputCSV, dialect = 'excel')

# create writer object for append
ca = csv.writer(appendCSV, dialect = 'excel')

# add pre-defined fields
cw.writerow(['FIELD1_','FIELD2_','FIELD3_','FIELD4_'])

# delete existing field names in input CSV
# ???????????????????????????

# loop through input csv, check for blanks, and write all changes to append csv
for row in cr:
    if row or any(row) or any(field.strip() for field in row):
        ca.writerow(row)

# close files
inputCSV.close()
outputCSV.close()
appendCSV.close()

Is this ok or is there a better way to do this?


Accepted answer by Laurence Gonsalves

Use the csv module:


import csv
...

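# newline='' lets the csv module handle line endings itself (Python 3)
with open(in_fnam, newline='') as in_file:
    with open(out_fnam, 'w', newline='') as out_file:
        writer = csv.writer(out_file)
        for row in csv.reader(in_file):
            # a completely empty line is read as an empty list
            if row:
                writer.writerow(row)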

If you also need to remove rows where all of the fields are empty, change the if row: line to:


if any(row):

And if you also want to treat fields that consist of only whitespace as empty you can replace it with:


if any(field.strip() for field in row):


Note that in Python 2.x, the csv module expected binary files, so you'd need to open your files with the 'b' flag. In 3.x, doing this will result in an error; open the files in text mode instead, ideally with newline='' as the csv documentation recommends.
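
Putting the three checks together, a minimal Python 3 sketch might look like the following. The function name remove_blank_rows and its strip_whitespace parameter are illustrative assumptions, not part of the csv module; only csv.reader, csv.writer and open come from the standard library.

```python
import csv


def remove_blank_rows(in_fnam, out_fnam, strip_whitespace=True):
    """Copy in_fnam to out_fnam, skipping blank rows (hypothetical helper).

    Returns the number of rows written.
    """
    written = 0
    # newline='' leaves line-ending handling to the csv module.
    with open(in_fnam, newline='') as in_file, \
            open(out_fnam, 'w', newline='') as out_file:
        writer = csv.writer(out_file)
        for row in csv.reader(in_file):
            if strip_whitespace:
                # Whitespace-only fields count as empty.
                keep = any(field.strip() for field in row)
            else:
                # Only truly empty fields count as empty.
                keep = any(row)
            if keep:
                writer.writerow(row)
                written += 1
    return written
```

Calling remove_blank_rows('input.csv', 'output.csv') would then write every non-blank row to output.csv; the file names here are placeholders.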


Answer by Paulo Scardine

You have to open a second file, write all non blank lines to it, delete the original file and rename the second file to the original name.


EDIT: a real blank line will be like '\n':


for line in f1.readlines():
    if line.strip() == '':
        continue
    f2.write(line)

A line with all blank fields would look like ',,,,,\n'. If you consider this a blank line:


for line in f1.readlines():
    if ''.join(line.split(',')).strip() == '':
        continue
    f2.write(line)

Opening, closing, deleting and renaming the files is left as an exercise for you. (hint: import os, help(open), help(os.rename), help(os.unlink))
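
For readers who want to check their work on that exercise, here is one possible sketch for Python 3. The helper name drop_blank_lines_in_place is made up for illustration, and it uses os.replace (which overwrites the target) in place of a separate delete and rename. It keeps the same line-based check as above, so it does not understand quoted fields.

```python
import os
import tempfile


def drop_blank_lines_in_place(path):
    """Rewrite path without blank lines (hypothetical helper name)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so the final
    # rename stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', newline='') as f2, \
                open(path, newline='') as f1:
            for line in f1:
                # Treat lines made only of commas and whitespace as blank.
                if ''.join(line.split(',')).strip() == '':
                    continue
                f2.write(line)
        # os.replace overwrites the original: delete + rename in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        os.unlink(tmp_path)
        raise
```

Writing to a temporary file first means the original is only replaced once the filtered copy has been written completely.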


EDIT2: Laurence Gonsalves brought to my attention that a valid csv file could have blank lines embedded in quoted csv fields, like 1, 'this\n\nis tricky',123.45. In this case the csv module will take care of that for you. I'm sorry Laurence, your answer deserved to be accepted. The csv module will also address the concerns about a line like "","",""\n.
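
A small sketch of that difference, using made-up sample data (with double quotes as the quote character, which is the csv module's default): the quoted field keeps its embedded blank line, while the truly empty line and the row of empty quoted fields are filtered out.

```python
import csv
import io

# Illustrative data: a quoted field with an embedded blank line, a truly
# empty line, and a line whose fields are all empty quoted strings.
sample = '1,"this\n\nis tricky",123.45\n\n"","",""\n2,ok,6.7\n'

# csv.reader parses the quoted field as one value, newlines included.
rows = list(csv.reader(io.StringIO(sample, newline='')))

# Keep only rows that contain at least one non-whitespace field.
kept = [row for row in rows if any(field.strip() for field in row)]
```

Here rows has four entries (the empty line is read as an empty list), and kept retains only the first and last of them.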


Answer by Mariano Ruiz

This script removes all the CR / CRLF characters embedded in a CSV file that has lines like this:


"My name";[email protected];"This is a comment.
Thanks!"

Execute the script https://github.com/eoconsulting/lr2excelcsv/blob/master/lr2excelcsv.py


Result (in Excel CSV format):


"My name",[email protected],"This is a comment. Thanks!"

Answer by vaibhav

Python code to remove blank lines from a CSV file without creating another file.


import csv

def ReadWriteconfig_file(file):
    try:
        # read every row into memory, skipping blank lines
        with open(file, 'r', newline='') as file_object:
            lines = csv.reader(file_object, delimiter=',', quotechar='"')
            flag = 0
            data = []
            for line in lines:
                if line == []:
                    flag = 1
                    continue
                data.append(line)
        if flag == 1:  # if blank line is present in file
            # rewrite the same file; csv.writer keeps quoting intact,
            # which a plain ','.join(line) would not
            with open(file, 'w', newline='') as file_object:
                writer = csv.writer(file_object, delimiter=',', quotechar='"')
                writer.writerows(data)
    except Exception as e:
        print(e)

Answer by Sagun Shrestha

Surprised that nobody here mentioned pandas. Here is a possible solution.


import pandas as pd
df = pd.read_csv('input.csv')
df.to_csv('output.csv', index=False)

Answer by Gordon Dennis

I need to do this, but without a blank row being written at the end of the CSV file, as this code unfortunately does (which is also what Excel does if you Save -> .csv). My (even simpler) code using the csv module does this too:


import csv

input = open("M51_csv_proc.csv", 'rb')
output = open("dumpFile.csv", 'wb')
writer = csv.writer(output)
for row in csv.reader(input):
    writer.writerow(row)
input.close()
output.close() 

M51_csv_proc.csv has exactly 125 rows; the program always outputs 126 rows, the last one being blank.


I've been through all these threads and nothing seems to change this behaviour.
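
One possible explanation, offered as an assumption rather than a diagnosis of the file above: csv.writer ends every row, including the last one, with its line terminator ('\r\n' by default), and some tools display that final terminator as an extra empty line. A sketch that builds the output in memory and strips the final terminator follows; rows_to_csv_text and its trailing_newline parameter are hypothetical names.

```python
import csv
import io


def rows_to_csv_text(rows, trailing_newline=False):
    """Serialize rows to CSV text (hypothetical helper)."""
    buffer = io.StringIO(newline='')
    csv.writer(buffer).writerows(rows)
    text = buffer.getvalue()
    # Every row ends with '\r\n'; optionally drop the one after the last row.
    if not trailing_newline and text.endswith('\r\n'):
        text = text[:-2]
    return text
```

The resulting text could then be written with open(path, 'w', newline='') so that no line endings are translated on the way out.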


Answer by Hamza Tayyab

Doing it with pandas is very simple. Open your csv file with pandas:


import pandas as pd
df = pd.read_csv("example.csv")
# count the missing values in each column of the csv file
print(df.isnull().sum())
# drop the rows in which every field is empty
# (a plain dropna() would also drop rows with only some fields missing)
modifiedDF = df.dropna(how='all')
# save the result to a new csv file
modifiedDF.to_csv('modifiedExample.csv', index=False)

Answer by Aizayousaf

Here is a solution using pandas that removes blank rows.


import pandas as pd
df = pd.read_csv('input.csv')
df.dropna(axis=0, how='all', inplace=True)
df.to_csv('output.csv', index=False)