pandas: Read all but the last line of a CSV file

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/33689694/

Date: 2020-09-14 00:13:25  Source: igfitidea

Read all but last line of CSV file in pandas

Tags: python, pandas, dataframe

Asked by eleanora

I have CSV files that I read into pandas with:

#!/usr/bin/env python

import pandas as pd
import sys

filename = sys.argv[1]
df = pd.read_csv(filename)

Unfortunately, the last line of these files is often corrupt (has the wrong number of commas). Currently I open each file in a text editor and remove the last line.

Is it possible to drop the last line in the same Python/pandas script that loads the CSV, so that I can avoid this extra manual step?

Answered by EdChum

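Pass error_bad_lines=False and it will skip this line automatically. (Note: error_bad_lines was deprecated in pandas 1.3 in favor of on_bad_lines='skip' and removed in pandas 2.0, so the call below only works on older versions.)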

df = pd.read_csv(filename, error_bad_lines=False)

The advantage of error_bad_lines is that it skips any erroneous line instead of choking on it, but if the last line is always bad then skipfooter=1 is better.
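For readers on pandas 1.3 or later, where on_bad_lines='skip' replaces error_bad_lines=False, the same idea can be sketched as follows. The sample data and the name csv_text are made up for illustration. As far as I know, only lines with too many fields count as "bad"; a line with too few fields is kept and padded with NaN, so this may not catch every kind of corrupt last line.

```python
import io

import pandas as pd

# Hypothetical sample data: the last line has too many fields (extra commas).
csv_text = "a,b,c\n1,2,3\n4,5,6\n7,8,9,10,11\n"

# on_bad_lines="skip" requires pandas >= 1.3; it drops lines that have
# more fields than the header instead of raising a parser error.
df = pd.read_csv(io.StringIO(csv_text), on_bad_lines="skip")

print(df)
```

In a real script, io.StringIO(csv_text) would be replaced by the filename taken from sys.argv[1], as in the question.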

Thanks to @DexterMorgan for pointing out that the skipfooter option forces pandas to use the Python parsing engine, which is slower than the C engine for parsing a CSV.
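A minimal sketch of the skipfooter approach, assuming the corrupt line is always the last one. The sample data and the name csv_text are hypothetical. Passing engine="python" explicitly is optional, but it makes the engine choice visible and should avoid the warning pandas emits when it falls back from the C engine.

```python
import io

import pandas as pd

# Hypothetical sample data: the last line has too few fields.
csv_text = "a,b,c\n1,2,3\n4,5,6\n7,8"

# skipfooter=1 drops the final line unconditionally, whether or not it is
# malformed. It is only supported by the (slower) Python engine.
df = pd.read_csv(io.StringIO(csv_text), skipfooter=1, engine="python")

print(df)
```

Unlike skipping bad lines, this always discards the last line, so it will also drop a valid final row if a file happens to be intact.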

Answered by Mangu Singh Rajpurohit

See the read_csv documentation: http://pandas.pydata.org/pandas-docs/version/0.16.2/generated/pandas.read_csv.html. The 'skipfooter' argument specifies the number of lines at the end of the .csv file that you don't want to read. It may help you.
