Python: old pre-0.17 behaviour of pandas.read_csv with `header=True` for inferring the header row?

Question

Asked by Roman

How did old pre-0.17 versions of pandas `read_csv()` interpret a boolean `header=True`/`header=False` when inferring the header row?

I have CSV data with a header:

col1;col2;col3
1.0;10.0;100.0
2.0;20.0;200.0
3.0;30.0;300.0

If read with `header=True`, i.e. `df = pandas.read_csv('test.csv', sep=';', header=True)`, that gives the following data frame:

   1.0  10.0  100.0
0    2    20    200
1    3    30    300

This means that pandas used the second row ("row 1") for the column names (the inferred names are '1.0', '10.0' and '100.0').

Whereas if read with `header=False`, i.e. `df = pandas.read_csv('test.csv', sep=';', header=False)`, it gives the following:

   col1  col2  col3
0     1    10   100
1     2    20   200
2     3    30   300

This means that pandas used the first row ("row 0") as the header, in spite of the fact that I explicitly specified that there is no header.

This behaviour is not intuitive to me. Can somebody explain what is happening?

Answer 1

Answered by EdChum

You are telling pandas which line is your header line. Passing `False` evaluates to `0`, which is why it reads the first line as the header, as expected. Passing `True` evaluates to `1`, so it reads the second line. If you pass `None`, pandas assumes there is no header row and auto-generates ordinal column names.
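The coercion described above comes from Python itself, where `bool` is a subclass of `int`. The following is a minimal sketch of that language-level behaviour only; it does not reproduce pandas' internal parsing code, and the `rows` list and variable names are made up for illustration.

```python
# bool is a subclass of int, so True/False act like 1/0 wherever
# an integer (for example a row number) is expected.
print(issubclass(bool, int))
print(int(False), int(True))

# Hypothetical stand-in for the lines of the CSV file.
rows = ["col1;col2;col3", "1.0;10.0;100.0", "2.0;20.0;200.0"]

# Indexing with a boolean selects row 0 or row 1, mirroring how the
# old parser treated header=False / header=True as a row number.
header_if_false = rows[False]
header_if_true = rows[True]
print(header_if_false)
print(header_if_true)
```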

In [17]:
import io
import pandas as pd

# Sample CSV with a header line; fields are separated by ';'
t = """col1;col2;col3
1.0;10.0;100.0
2.0;20.0;200.0
3.0;30.0;300.0"""

# Boolean header values are only accepted by pandas < 0.17
print('False:\n', pd.read_csv(io.StringIO(t), sep=';', header=False))
print('\nTrue:\n', pd.read_csv(io.StringIO(t), sep=';', header=True))
print('\nNone:\n', pd.read_csv(io.StringIO(t), sep=';', header=None))

False:
    col1  col2  col3
0     1    10   100
1     2    20   200
2     3    30   300

True:
    1.0  10.0  100.0
0    2    20    200
1    3    30    300

None:
       0     1      2
0  col1  col2   col3
1   1.0  10.0  100.0
2   2.0  20.0  200.0
3   3.0  30.0  300.0
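The same three results can be requested explicitly with integer (or `None`) header values, which avoids relying on boolean coercion. This is a sketch assuming pandas is installed; `csv_text` and the `read` helper are hypothetical names introduced here for brevity.

```python
import io
import pandas as pd

csv_text = """col1;col2;col3
1.0;10.0;100.0
2.0;20.0;200.0
3.0;30.0;300.0"""

def read(header):
    # Hypothetical helper: parse the sample text with the given header value.
    return pd.read_csv(io.StringIO(csv_text), sep=";", header=header)

first_row_header = read(0)      # row 0 supplies the column names
second_row_header = read(1)     # row 1 supplies the names; row 0 is dropped
no_header = read(None)          # no header row; columns are numbered 0, 1, 2

print(list(first_row_header.columns))
print(list(second_row_header.columns))
print(list(no_header.columns))
```

Passing `header=0` states the intent directly, so the call reads the same way regardless of how a given pandas version treats booleans.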

UPDATE

Since version 0.17.0, this raises a `TypeError`.
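A quick way to check this on an installed pandas is to catch the exception. The sketch below assumes pandas 0.17.0 or later; `csv_text` and `try_header` are hypothetical names, and the exact error message is not shown because it may differ between versions.

```python
import io
import pandas as pd

csv_text = "col1;col2;col3\n1.0;10.0;100.0\n"

def try_header(value):
    """Return the name of the exception raised for this header value, or None."""
    try:
        pd.read_csv(io.StringIO(csv_text), sep=";", header=value)
    except TypeError as exc:
        return type(exc).__name__
    return None

# Both booleans are expected to be rejected on pandas >= 0.17.
results = {value: try_header(value) for value in (True, False)}
print(results)
```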
