pandas ParserError: Error tokenizing data. C error: Expected 2503 fields in line 2624, saw 52523
Note: this page is based on a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/46538726/
ParserError: Error tokenizing data. C error: Expected 2503 fields in line 2624, saw 52523
Asked by Jayashree
I use the pandas read_csv function to read my CSV file.
feature_file_df_5=pd.read_csv('/home/jayashree/Documents/Nokia/DataSet/SMT Data Analytics/SPI (Solder Paste Inspection)/086990A-108-FHFB-TRX-985676H-BOTTOM-N_0608_2001_2500.csv',header=501)
I am getting a parser error:
/home/jayashree/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in read(self, nrows)
1717 def read(self, nrows=None):
1718 try:
-> 1719 data = self._reader.read(nrows)
1720 except StopIteration:
1721 if self._first_chunk:
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read (pandas/_libs/parsers.c:10862)()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory (pandas/_libs/parsers.c:11138)()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_rows (pandas/_libs/parsers.c:11884)()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows (pandas/_libs/parsers.c:11755)()
pandas/_libs/parsers.pyx in pandas._libs.parsers.raise_parser_error (pandas/_libs/parsers.c:28765)()
ParserError: Error tokenizing data. C error: Expected 2503 fields in line 2624, saw 52523
Based on suggestions from this thread [1], I tried adding the sep option:
feature_file_df_5=pd.read_csv('/home/jayashree/Documents/Nokia/DataSet/SMT Data Analytics/SPI (Solder Paste Inspection)/086990A-108-FHFB-TRX-985676H-BOTTOM-N_0608_2001_2500.csv', sep=',',header=501)
I still get the same error. When I use sep=None:
feature_file_df_5=pd.read_csv('/home/jayashree/Documents/Nokia/DataSet/SMT Data Analytics/SPI (Solder Paste Inspection)/086990A-108-FHFB-TRX-985676H-BOTTOM-N_0608_2001_2500.csv', sep=None,header=501)
I get this error:
/home/jayashree/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in _rows_to_cols(self, content)
2782 msg = ('Expected %d fields in line %d, saw %d' %
2783 (col_len, row_num + 1, actual_len))
-> 2784 if len(self.delimiter) > 1 and self.quoting != csv.QUOTE_NONE:
2785 # see gh-13374
2786 reason = ('Error could possibly be due to quotes being '
TypeError: object of type 'NoneType' has no len()
[1]: https://stackoverflow.com/questions/18039057/python-pandas-error-tokenizing-data
When I open the file in a spreadsheet, I cannot find any problem; all rows are present. How can I resolve this error?
Answered by Anastasia Manokhina
You should possibly experiment with the quoting and quotechar parameters, which can help pandas split each line into fields correctly.
More details here:
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
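As a rough illustration of how quotechar can matter, here is a minimal sketch. The sample data is made up and is not the asker's file; it assumes the file wraps text fields in single quotes, so a comma inside a field is read as a delimiter unless pandas is told which quote character is in use. The quoting parameter takes the constants from Python's csv module; csv.QUOTE_MINIMAL is the default and is passed here only to make it visible.

```python
import csv
import io

import pandas as pd

# Hypothetical sample: the second data row has a comma inside a
# single-quoted field.
raw = (
    "id,comment,value\n"
    "1,'pad B',0.7\n"
    "2,'pad A, row 2',0.5\n"
)

# With the default quotechar ('"'), the single quotes are treated as plain
# characters, so the last row appears to have one field too many.
try:
    pd.read_csv(io.StringIO(raw))
except pd.errors.ParserError as exc:
    print("default settings failed:", exc)

# Telling pandas that fields are quoted with single quotes keeps the
# embedded comma inside one field.
df = pd.read_csv(io.StringIO(raw), quotechar="'", quoting=csv.QUOTE_MINIMAL)
print(df)
```

Whether this helps with a real file depends on how that file was written, so it is worth looking at the raw text of the failing line in a text editor first.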
Or, if there are only one or a few broken rows and they can be omitted, use error_bad_lines=False.
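A small sketch of skipping malformed rows follows. It assumes pandas 1.3 or later, where error_bad_lines was deprecated in favor of on_bad_lines (error_bad_lines was removed in pandas 2.0); on older versions, error_bad_lines=False plays the same role. The sample data is invented for illustration.

```python
import io

import pandas as pd

# Hypothetical sample: the second data row has five fields instead of three.
raw = (
    "a,b,c\n"
    "1,2,3\n"
    "4,5,6,7,8\n"
    "9,10,11\n"
)

# on_bad_lines="skip" drops rows with too many fields instead of raising
# ParserError. Assumes pandas >= 1.3.
df = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
print(df)
```

Skipped rows are dropped silently with "skip"; on_bad_lines="warn" reports them instead, which may be safer when it is unclear how many rows are affected.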