Pandas - Tokenizing Data Expected 1 field saw multiple
Original question: http://stackoverflow.com/questions/25254908/
Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Asked by Ahmed Haque
A bit confused why I am getting this error. I thought skiprows should have taken care of me.
Error:
CParserError: Error tokenizing data. C error: Expected 1 fields in line 6, saw 13
Line:
df_data = pd.read_csv(infile.name, skiprows=[6], sep=',')
CSV:
Header: 1asdf
Header: 2fac
Header: 3aaz
Header: 4ssw
Header: 5aaa
0.0,-64,192,152,27023,3,0,26275,31473,149,67,77,0.0
0.04050016403198242,-64,192,148,27021,3,0,26274,31471,149,67,77,0.038919925689697266
0.08100008964538574,-64,192,148,27017,3,0,26275,31467,149,67,77,0.07783985137939453
0.12150001525878906,-60,192,148,27019,3,0,26277,31467,149,67,77,0.1167600154876709
0.16199994087219238,-60,192,144,27015,3,0,26277,31463,149,67,77,0.15567994117736816
0.2025001049041748,-60,192,148,27075,3,0,26319,31463,149,67,77,0.19460034370422363
Answered by Python Beginner WG
I got the same error message. In my case it was because commas were used as decimal marks and the cells were separated by semicolons. Passing sep=";" solved the problem:
pd.read_csv(infile.name, sep=";")
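For files like the one described in this answer, a minimal sketch might look like the following. The sample data and column names are made up for illustration. The decimal="," argument is an addition not mentioned in the answer; it is a real read_csv parameter that may help when commas are decimal marks, since otherwise such values would likely be read as strings.

```python
import io
import pandas as pd

# Hypothetical sample: semicolon-separated cells, commas as decimal marks.
raw = "time;value\n0,5;1,25\n1,0;2,75\n"

# sep=";" splits the cells; decimal="," parses "0,5" as the float 0.5.
df = pd.read_csv(io.StringIO(raw), sep=";", decimal=",")
print(df.dtypes)
```

If the numbers in your file already use a dot as the decimal mark, sep=";" alone should be enough.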
Answered by chrisb
If you pass a list to skiprows, it interprets it as 'skip the rows in this list (0 indexed)'. Pass an integer instead. You probably also want header=None so your first row of data doesn't become the column names.
pd.read_csv(infile.name, skiprows=6, header=None)
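The difference between the list form and the integer form can be sketched on a small in-memory file. The sample below is hypothetical (five one-field preamble lines followed by three-column data), so it uses skiprows=5; the right count for a real file depends on how many preamble lines it actually has.

```python
import io
import pandas as pd

# Hypothetical file: 5 one-field preamble lines, then comma-separated data.
raw = "\n".join(
    [f"Header: {i}" for i in range(1, 6)]
    + ["0.0,-64,192", "0.04,-64,192", "0.08,-60,192"]
)

# skiprows=[6] skips only the row at index 6 (0-indexed),
# so the one-field preamble lines are still parsed and set the expected width.
try:
    pd.read_csv(io.StringIO(raw), skiprows=[6])
    list_form_failed = False
except pd.errors.ParserError as exc:
    list_form_failed = True
    print(exc)

# skiprows=5 skips the first five lines; header=None keeps the first data row as data.
df = pd.read_csv(io.StringIO(raw), skiprows=5, header=None)
print(df.shape)
```

Note that recent pandas versions raise pd.errors.ParserError; the CParserError name in the question comes from an older release.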
Answered by YouAreAwesome
This error occurs when a row has more column entries than the schema specifies.
That means one of your columns most likely contains the delimiter inside its values.
The parser then assumes a new column is starting, but no such column exists in the schema, so the exception is thrown at runtime.
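To find which lines carry the extra delimiter, one possible approach is to count the fields per line with the standard csv module before handing the file to pandas. This is only a sketch on made-up data; the "most common field count" heuristic is an assumption and will not fit every file.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in which one row carries an extra delimiter.
raw = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10\n"

# Count the fields on each line with the standard csv reader.
rows = list(csv.reader(io.StringIO(raw)))
counts = Counter(len(row) for row in rows)
expected = counts.most_common(1)[0][0]

# Collect 1-based line numbers whose field count differs from the most common one.
bad_lines = [i for i, row in enumerate(rows, start=1) if len(row) != expected]
print(bad_lines)
```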
Solution for that:
- The best option is to ask whoever generates the input to fix it at the source.
- The second option, if you are allowed to skip records, is to use this:
df = pd.read_csv(file_loc, sep=',', keep_default_na=False)
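As far as I know, keep_default_na only controls which strings are parsed as NaN and does not skip any rows. If the goal is to drop malformed records, a sketch using on_bad_lines="skip" may be closer to what is intended. This is a swapped-in technique, not the answer's own code; it assumes pandas 1.3 or newer (older versions used error_bad_lines=False), and the sample data is made up.

```python
import io
import pandas as pd

# Hypothetical sample: the third line has one field too many.
raw = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10\n"

# on_bad_lines="skip" (pandas 1.3+) drops lines with too many fields instead of raising.
df = pd.read_csv(io.StringIO(raw), sep=",", on_bad_lines="skip")
print(df)
```

Skipping silently discards data, so it is worth checking how many rows were dropped before relying on the result.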

