Python: prevent pandas read_csv from treating the first row as a header of column names

Question

Asked by R.M.

I'm reading in a pandas DataFrame using pd.read_csv. I want to keep the first row as data, but it keeps getting converted to column names.

I tried header=False, but this just deleted the row entirely.

(Note on my input data: I have a string (st = '\n'.join(lst)) that I convert to a file-like object (io.StringIO(st)), then build the csv from that file object.)

Answer 1

Answered by EdChum

You want header=None. False gets type-promoted to int, i.e. 0, so it is treated as header=0. See the docs (emphasis mine):

header : int or list of ints, default 'infer'
Row number(s) to use as the column names, and the start of the data. Default behavior is as if set to 0 if no names passed, otherwise None. Explicitly pass header=0 to be able to replace existing names. The header can be a list of integers that specify row locations for a multi-index on the columns e.g. [0,1,3]. Intervening rows that are not specified will be skipped (e.g. 2 in this example is skipped). Note that this parameter ignores commented lines and empty lines if skip_blank_lines=True, so header=0 denotes the first line of data rather than the first line of the file.

You can see the difference in behaviour, first with header=0:


In [95]:
import io
import pandas as pd
t="""a,b,c
0,1,2
3,4,5"""
pd.read_csv(io.StringIO(t), header=0)

Out[95]:
   a  b  c
0  0  1  2
1  3  4  5

Now with header=None:

In [96]:
pd.read_csv(io.StringIO(t), header=None)

Out[96]:
   0  1  2
0  a  b  c
1  0  1  2
2  3  4  5
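The quoted documentation also says the default behaves like header=None once names is passed. A minimal sketch of that case, reusing the sample string from above; the column labels x, y and z are made up for illustration and are not part of the original answer:

```python
import io
import pandas as pd

t = """a,b,c
0,1,2
3,4,5"""

# Passing names without an explicit header makes read_csv treat every
# row as data. The labels "x", "y", "z" are hypothetical.
df = pd.read_csv(io.StringIO(t), names=["x", "y", "z"])
print(df)
```

This may be convenient when you want meaningful column labels rather than the default integer ones, while still keeping the first row as data.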

Note that as of version 0.19.1 (the latest at the time of writing), passing header=False raises a TypeError:

In [98]:
pd.read_csv(io.StringIO(t), header=False)

TypeError: Passing a bool to header is invalid. Use header=None for no header or header=int or list-like of ints to specify the row(s) making up the column names


Answer 2

Answered by jezrael

I think you need to pass the parameter header=None to read_csv:

Sample:


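import pandas as pd
# pandas.compat.StringIO was removed in later pandas releases; use io.StringIO
from io import StringIO

temp = u"""a,b
2,1
1,1"""

# header=None keeps the first line ("a,b") as a data row
df = pd.read_csv(StringIO(temp), header=None)
print(df)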
   0  1
0  a  b
1  2  1
2  1  1
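Applied to the setup described in the question (a list of lines joined into a string and wrapped in io.StringIO), the same fix might look like the sketch below. The contents of lst are hypothetical, since the question does not show the actual data:

```python
import io
import pandas as pd

# Hypothetical input: a list of CSV-formatted lines with no header row.
lst = ["2,1", "1,1", "5,3"]
st = "\n".join(lst)

# header=None keeps the first line as data; columns are labelled 0, 1, ...
df = pd.read_csv(io.StringIO(st), header=None)
print(df)
```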
