pandas.errors.ParserError: Error could possibly be due to quotes being ignored when a multi-char delimiter is used
Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original source: http://stackoverflow.com/questions/53066229/
Asked by dark horse
I get a ParserError when I try to read a csv file using Pandas. The error and the data that triggered it are given below.
pandas.errors.ParserError: Expected 10 fields in line 8, saw 11. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.
Line 8, which triggers the error, is:
10/29/18 10:20,85505306, Scott,20181029102023-file.csv, 22.49,-12.18,CITY,,12:15.0,51:00.0,ABCD,9898,320,D231
I am reading the csv using the command below:
df.to_csv('file.csv'), index = False)
Sample output of the csv file:
File_Received_Time Label1 City FileName Label2 Label3 State Unnamed: 12 cTimestamp dTimestamp Label4 Label5 Label6 Label7 Label8
10/29/18 10:20 56776 Paris file1.csv 29 29 IL 29-10-2018 04:11:11 COL06 620 398 516 451
10/29/18 10:20 46069 Hongkong file2.csv 61 58 VA 29-10-2018 04:03:17 28-10-2018 05:58:00 COL06 576 645 349 374
10/29/18 10:20 47240 Sydney file3.csv 43 42 IL 29-10-2018 04:12:46 COL06 534 2047 56831 372
10/29/18 10:20 47432 NewYork file4.csv 55 61 OH 28-10-2018 09:01:00 COL06 514 2354 640 633
10/29/18 10:20 41794 London file5.csv 39 29 29-10-2018 04:12:46 28-10-2018 09:01:00 COL06 470 2354 56831 550
10/29/18 10:20 49643 LA file6.csv 55 43 TX 29-10-2018 04:05:18 COL06 523 2301 53942 403
10/29/18 10:20 54700 Shangai file7.csv 37 29 AZ 29-10-2018 04:12:15 28-10-2018 12:51:00 COL06 569 2683 53642 538
10/29/18 10:20 37134 Singapore file8.csv 53 62 AZ 29-10-2018 04:09:16 COL06 560 391 54541 542
10/29/18 10:20 51144 Taiwan file9.csv 43 33 TX 29-10-2018 04:12:15 COL06 469 472 458 481
Answered by Mayank Porwal
I am able to read the error record you pasted above:
To read a csv with pandas, use read_csv:
I pasted the error record into a csv file:
mayankp@mayank:~/Documents$ cat t1.csv
10/29/18 10:20,85505306, Scott,20181029102023-file.csv, 22.49,-12.18,CITY,,12:15.0,51:00.0,ABCD,9898,320,D231
Now I read it in pandas as follows:
In [114]: df = pd.read_csv('/home/mayankp/Documents/t1.csv', header=None)
In [115]: df
Out[115]:
0 1 2 3 4 5 6 7 8 9 10 11 12 13
0 10/29/18 10:20 85505306 Scott 20181029102023-file.csv 22.49 -12.18 CITY NaN 12:15.0 51:00.0 ABCD 9898 320 D231
It works fine. Let me know if this helps.
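The same check can be reproduced without a file on disk. This is a minimal sketch, assuming the record is held in a string rather than in t1.csv; skipinitialspace=True is an addition that was not part of the session above, used here only to drop the blank that follows some of the commas.

```python
import io

import pandas as pd

# The error record from the question, held in memory instead of t1.csv.
record = (
    "10/29/18 10:20,85505306, Scott,20181029102023-file.csv, 22.49,-12.18,"
    "CITY,,12:15.0,51:00.0,ABCD,9898,320,D231\n"
)

# header=None: the record is data, not a row of column names.
# skipinitialspace=True (an addition here) drops the blank after a comma.
df = pd.read_csv(io.StringIO(record), header=None, skipinitialspace=True)

print(df.shape)       # number of rows and number of fields
print(df.iloc[0, 2])  # the third field of the record
```

The record parses as a single row of fourteen fields, with the empty eighth field read as NaN.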
Answered by TubasPandas
I had the same error message. Removing the double quotes from the file solved the problem. I used the line below in the terminal:
cat merged.csv | tr '"' 'o' > merged.tsv
Hope that helps.
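The same idea can be tried in Python before the data reaches pandas. This is only a sketch under stated assumptions: strip_double_quotes is a hypothetical helper and the sample text is invented for illustration, not taken from merged.csv.

```python
import io

import pandas as pd


def strip_double_quotes(text: str) -> str:
    """Return the text with every double-quote character removed."""
    return text.replace('"', "")


# Hypothetical sample: the second line contains a stray, unbalanced quote.
raw = 'a,b,c\n1,"x,3\n4,y,6\n'

clean = strip_double_quotes(raw)
df = pd.read_csv(io.StringIO(clean))
print(df)
```

Removing quotes is a blunt fix: any field that relies on quoting to protect an embedded comma will be split into extra fields afterwards, so it is worth checking the field counts once the quotes are gone.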
Answered by ryolait
So,
- You are using to_csv instead of read_csv. See Mayank Porwal's comment and answer.
- Your data may not be properly formatted. CSV means Comma Separated Values, so separate the values with commas before using read_csv (I am not sure which dataset you use in your own tests; your question is misleading on that point).
- For the core problem, carefully check the number of fields you have on each row. Every row should have the same number. This may explain why you get that error.
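The field-count check in the last point can be done with the standard csv module before pandas is involved. This is a minimal sketch: find_ragged_rows is a hypothetical helper, the sample data is made up, and it assumes the first row defines the expected width.

```python
import csv
import io


def find_ragged_rows(text, delimiter=","):
    """Return (row_number, field_count) for rows whose width differs from the first row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    expected = None
    ragged = []
    for row_number, row in enumerate(reader, start=1):
        if expected is None:
            expected = len(row)  # the first row sets the expected width
        elif len(row) != expected:
            ragged.append((row_number, len(row)))
    return ragged


# Hypothetical sample: row 3 has one field too many, row 4 one too few.
sample = "a,b,c\n1,2,3\n4,5,6,7\n8,9\n"
print(find_ragged_rows(sample))
```

For a file on disk, the same function can be applied to the text returned by reading the file; the rows it reports are the ones to inspect first.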