pandas.errors.EmptyDataError: No columns to parse from file
Original question: http://stackoverflow.com/questions/50333067/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by ubuntu_noob
I have a list holding the paths of three folders, each of which contains a lot of .txt files. I am trying to work with each file by reading it into a pandas DataFrame, but I get the error shown below.
CODE-
for l in list:
    for root, dirs, files in os.walk(l, topdown=False):
        for name in files:
            #print(os.path.join(root, name))
            df = pd.read_csv(os.path.join(root, name))
ERROR-
Traceback (most recent call last):
  File "feature_drebin.py", line 18, in <module>
    df = pd.read_csv(os.path.join(root, name))
  File "E:\anaconda\lib\site-packages\pandas\io\parsers.py", line 709, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "E:\anaconda\lib\site-packages\pandas\io\parsers.py", line 449, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "E:\anaconda\lib\site-packages\pandas\io\parsers.py", line 818, in __init__
    self._make_engine(self.engine)
  File "E:\anaconda\lib\site-packages\pandas\io\parsers.py", line 1049, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "E:\anaconda\lib\site-packages\pandas\io\parsers.py", line 1695, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 565, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
.txt file
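One common cause of this error is a completely empty file somewhere in the folders being walked. Assuming that is what happens here, a minimal sketch of the same loop that skips such files instead of crashing could look like this. The function name read_all and the returned dict/list are made up for illustration; pd.errors.EmptyDataError is the exception class named in the traceback.

```python
import os

import pandas as pd


def read_all(folders):
    """Read every parseable file under the given folders (hypothetical helper)."""
    frames = {}   # path -> DataFrame for files that parsed
    skipped = []  # paths of files pandas found empty
    for folder in folders:
        for root, dirs, files in os.walk(folder, topdown=False):
            for name in files:
                path = os.path.join(root, name)
                try:
                    frames[path] = pd.read_csv(path)
                except pd.errors.EmptyDataError:
                    # Nothing to parse in this file: note it and move on.
                    skipped.append(path)
    return frames, skipped
```

Printing the skipped list afterwards shows which files were empty, which is usually the quickest way to confirm whether this is the actual cause.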
Answered by Phil W
I had the same problem and the answer was above: "This error will also come if you are reading the csv which you have just created"
I have a rubbishy csv file created elsewhere, over which I have no control. The file starts with two meaningless (at least useless to me) lines, then two blank lines, then the data, with column headings that are phrases rather than words, i.e. each heading has multiple words separated by spaces. To anyone with a data background, that is a big NO. If you have column headings with spaces in them you are asking for problems; always use single words.
My plan for this csv was to open it, delete the first five rows and write the remaining lines to a newly created csv to which I had already written the new heading line. The problem was that when I tried to read the new file into a dataframe, pandas threw the 'empty data error'.
Examination of the source and target files showed them to be perfect: both could be opened in Notepad or Excel. All the answers I could find referred to checking file paths, delimiters, encoding, etc.
It seems to me that Python doesn't follow our line-by-line instructions but goes off to do other bits while earlier instructions have not yet been completed: multitasking. To prove my point, I commented out the lines that write to the new file (it had already been created on a previous run) and the df came up perfectly.
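A likelier explanation for the behaviour described above is buffering rather than multitasking: Python runs the statements in order, but text written to a file can sit in an in-memory buffer until the file is flushed or closed, so reading the file back too early finds it empty on disk. A minimal sketch of the "skip the junk rows, write a new heading, read it back" plan, assuming that is the cause; the function name clean_csv and its parameters are made up for illustration.

```python
import pandas as pd


def clean_csv(src_path, dst_path, skip_rows, header_line):
    """Copy src to dst without its first skip_rows lines, then load dst."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        # New single-word heading line replaces the original one.
        dst.write(header_line + "\n")
        for i, line in enumerate(src):
            if i >= skip_rows:
                dst.write(line)
    # Leaving the with-block closes both files, which flushes the buffer,
    # so the data is on disk before pandas opens the file.
    return pd.read_csv(dst_path)
```

The important design choice is that read_csv is only called after the with-block ends. Calling dst.flush() before reading would be an alternative if the file has to stay open.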
Answered by Rishav Sharma
This error also occurs if you read a csv that you have just created. One solution is to create another thread that calls a separate function to read the csv and perform the other operations. The code below works when you have to merge multiple csv files into one Excel file:
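import glob
import threading

import pandas as pd

lock = threading.Lock()

def function_name():
    # Hold the lock while merging every csv in this folder into one workbook.
    with lock:
        stock = glob.glob("./*.csv")
        df_file = (pd.read_csv(g) for g in stock)
        # The with-block saves and closes the workbook on exit
        # (ExcelWriter.save() was removed in pandas 2.0).
        with pd.ExcelWriter('./Final.xlsx') as writi:
            for inn, di in enumerate(df_file):
                di.to_excel(writi, sheet_name='view{}.csv'.format(inn))

# The function must be defined before the thread that targets it is created.
t4 = threading.Thread(target=function_name)
t4.start()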
Answered by Logan
If you are trying to read space-separated .txt files into a pandas DataFrame, you need to pass the sep=" " argument.
This will tell Pandas to use a space as the delimiter instead of the standard comma.
Also, if you are importing from a text file that has no column names in the data, you should pass the header=None argument. Your call would then look like this:
df = pd.read_csv('output_list.txt', sep=" ", header=None)
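To see what those two arguments do, here is a small sketch that uses made-up in-memory data instead of a real file (the sample strings are assumptions, not the asker's data). It also shows sep=r"\s+", which may be the better choice when the columns are separated by runs of spaces or tabs rather than exactly one space.

```python
import io

import pandas as pd

# Hypothetical space-delimited data with no header row.
sample = io.StringIO("1 2 3\n4 5 6\n")
df = pd.read_csv(sample, sep=" ", header=None)
# With header=None pandas numbers the columns 0, 1, 2.
print(df)

# Columns separated by a varying amount of whitespace.
ragged = io.StringIO("1   2\t3\n4 5    6\n")
df_ws = pd.read_csv(ragged, sep=r"\s+", header=None)
print(df_ws)
```

Note that neither argument helps if the file is truly empty; in that case pandas still raises EmptyDataError.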