pandas ParserError: Error tokenizing data. C error: Expected 2503 fields in line 2624, saw 52523

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/46538726/

Time: 2020-09-14 04:34:13  Source: igfitidea

ParserError: Error tokenizing data. C error: Expected 2503 fields in line 2624, saw 52523

python pandas dataframe

Asked by Jayashree

I use the pandas read_csv function to read my CSV file.

feature_file_df_5=pd.read_csv('/home/jayashree/Documents/Nokia/DataSet/SMT Data Analytics/SPI (Solder Paste Inspection)/086990A-108-FHFB-TRX-985676H-BOTTOM-N_0608_2001_2500.csv',header=501)

I am getting a parser error:

/home/jayashree/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in read(self, nrows)
   1717     def read(self, nrows=None):
   1718         try:
-> 1719             data = self._reader.read(nrows)
   1720         except StopIteration:
   1721             if self._first_chunk:

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read (pandas/_libs/parsers.c:10862)()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory (pandas/_libs/parsers.c:11138)()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_rows (pandas/_libs/parsers.c:11884)()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows (pandas/_libs/parsers.c:11755)()

pandas/_libs/parsers.pyx in pandas._libs.parsers.raise_parser_error (pandas/_libs/parsers.c:28765)()

ParserError: Error tokenizing data. C error: Expected 2503 fields in line 2624, saw 52523

Based on suggestions from this thread [1], I tried adding the sep option:

feature_file_df_5=pd.read_csv('/home/jayashree/Documents/Nokia/DataSet/SMT Data Analytics/SPI (Solder Paste Inspection)/086990A-108-FHFB-TRX-985676H-BOTTOM-N_0608_2001_2500.csv', sep=',',header=501)

I still get the same error. When I used sep=None:

feature_file_df_5=pd.read_csv('/home/jayashree/Documents/Nokia/DataSet/SMT Data Analytics/SPI (Solder Paste Inspection)/086990A-108-FHFB-TRX-985676H-BOTTOM-N_0608_2001_2500.csv', sep=None,header=501)

I am getting this error

/home/jayashree/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in _rows_to_cols(self, content)
   2782                 msg = ('Expected %d fields in line %d, saw %d' %
   2783                        (col_len, row_num + 1, actual_len))
-> 2784                 if len(self.delimiter) > 1 and self.quoting != csv.QUOTE_NONE:
   2785                     # see gh-13374
   2786                     reason = ('Error could possibly be due to quotes being '

TypeError: object of type 'NoneType' has no len()


  [1]: https://stackoverflow.com/questions/18039057/python-pandas-error-tokenizing-data

When I open the file in a spreadsheet, I cannot find any problem, and all rows are present. How can I resolve this error?
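
A spreadsheet can hide rows with an unexpected number of fields, so one way to locate them is to count the fields on each line with Python's csv module. The following is a minimal sketch and not part of the original question; the helper name field_counts and the sample data are hypothetical, and a real file would be opened with open(path, newline='') instead of io.StringIO.

```python
import csv
import io
from collections import Counter


def field_counts(lines, delimiter=","):
    """Return {record_number: field_count}, numbering records from 1.

    Note: csv.reader yields records, so a quoted field containing a
    newline counts as one record, not two physical lines.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    return {i: len(row) for i, row in enumerate(reader, start=1)}


# Hypothetical sample: the third line has one field too many.
sample = io.StringIO("a,b,c\n1,2,3\n4,5,6,7\n8,9,10\n")

counts = field_counts(sample)

# Treat the most common field count as the expected width.
expected = Counter(counts.values()).most_common(1)[0][0]

# Lines whose width differs from the expected one are the suspects.
odd_lines = {n: c for n, c in counts.items() if c != expected}
print(odd_lines)
```

Any line reported here is a candidate for the row that the parser error points at.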

Answered by Anastasia Manokhina

You could experiment with the quoting and quotechar parameters, which control how quoted fields are parsed and can help when field boundaries are being misread. More details here: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
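
To illustrate why these two parameters matter, here is a minimal sketch on hypothetical in-memory data (not the asker's file). It assumes the extra fields come from delimiters inside quoted values, which may or may not be the cause in the question.

```python
import csv
import io

import pandas as pd

# Hypothetical sample: the second data row has a comma inside a quoted field.
raw = 'id,comment\n1,plain\n2,"has, a comma"\n'

# With quote handling on (these are the defaults), the quoted comma
# stays inside a single field and the file parses as two columns.
df = pd.read_csv(io.StringIO(raw), quotechar='"', quoting=csv.QUOTE_MINIMAL)

# With quote handling off, the same row is split on every comma,
# so the parser sees more fields than the header declares.
try:
    pd.read_csv(io.StringIO(raw), quoting=csv.QUOTE_NONE)
    failed = False
except pd.errors.ParserError as exc:
    failed = True
    print(exc)
```

If the file uses a different quote character, such as a single quote, passing it as quotechar is the analogous adjustment.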

Or, if only one or a few broken rows can safely be omitted, use error_bad_lines=False.
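
Note that error_bad_lines was deprecated in pandas 1.3 and removed in pandas 2.0; the replacement is on_bad_lines='skip'. The sketch below tries the newer keyword first and falls back to the older one. The helper name read_csv_skipping_bad_lines and the sample data are hypothetical and only serve the illustration.

```python
import io

import pandas as pd


def read_csv_skipping_bad_lines(source, **kwargs):
    """Read a CSV, dropping rows that have too many fields."""
    try:
        # pandas >= 1.3
        return pd.read_csv(source, on_bad_lines="skip", **kwargs)
    except TypeError:
        # Older pandas does not know on_bad_lines; use the legacy flag.
        return pd.read_csv(source, error_bad_lines=False, **kwargs)


# Hypothetical sample: the second data row has two extra fields.
raw = "a,b,c\n1,2,3\n4,5,6,7,8\n9,10,11\n"

df = read_csv_skipping_bad_lines(io.StringIO(raw))
print(df)
```

Skipped rows are silently lost, so this only makes sense when the dropped data is known to be unimportant.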
