pandas cannot access DataFrame columns

Question

Asked by drevicko

I'm importing a DataFrame from a CSV file, but cannot access some of its columns by name. What's going on?

In more concrete terms:

> import pandas

> jobNames = pandas.read_csv("job_names.csv")
> print(jobNames)

   job_id   job_name   num_judgements
0  933985        Foo              180
1  933130        Moo              175
2  933123        Goo              150
3  933094       Flue              120
4  933088        Tru              120

When I try to access the second column, I get an error:

> jobNames.job_name

AttributeError: 'DataFrame' object has no attribute 'job_name'

Strangely, I can access the job_id column thus:

> print(jobNames.job_id)

0    933985
1    933130
2    933123
3    933094
4    933088
Name: job_id, dtype: int64

Edit (to put the accepted answer in context):

It turns out that the first row of the csv file (with the column names) looks like this:

job_id, job_name, num_judgements

Note the spaces after each comma! Those spaces are retained in the column names:

> jobNames.columns[1]

' job_name'

Names with a leading space are not valid Python identifiers, so those columns aren't available as DataFrame attributes. I can still access them dict-style:

> jobNames[' job_name']
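
To make the symptom reproducible without the original file, here is a minimal sketch. The inline `csv_text` is a hypothetical stand-in for `job_names.csv` (only two rows, same header with a space after each comma), and the variable names are illustrative rather than taken from the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for job_names.csv: note the space after each comma.
csv_text = (
    "job_id, job_name, num_judgements\n"
    "933985, Foo, 180\n"
    "933130, Moo, 175\n"
)
job_names = pd.read_csv(io.StringIO(csv_text))

# repr() makes the leading spaces in the column names visible.
print([repr(name) for name in job_names.columns])

# Attribute access only works when the column name is a valid identifier.
has_job_id_attr = hasattr(job_names, "job_id")
has_job_name_attr = hasattr(job_names, "job_name")
print(has_job_id_attr, has_job_name_attr)

# Bracket access works as long as the exact name, space included, is used.
job_name_column = job_names[" job_name"]
print(job_name_column)
```

Printing the column names with `repr()` is a quick way to spot stray whitespace whenever a column that appears in the printed frame cannot be found by name.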

Answer 1

Accepted answer by Maxim Egorushkin

When using pandas.read_csv, pass the skipinitialspace=True flag to remove whitespace after CSV delimiters.
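
As a rough illustration of this flag, the sketch below reads a small inline CSV instead of the asker's file. The `csv_text` string is a hypothetical stand-in with the same header layout; only `skipinitialspace=True` comes from the answer itself.

```python
import io

import pandas as pd

# Hypothetical stand-in for job_names.csv: note the space after each comma.
csv_text = (
    "job_id, job_name, num_judgements\n"
    "933985, Foo, 180\n"
    "933130, Moo, 175\n"
)

# skipinitialspace=True tells the parser to skip spaces that follow a delimiter,
# for the header row and the data rows alike.
job_names = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

print(list(job_names.columns))
print(job_names.job_name)
```

Because the spaces are dropped while parsing, the string values in the data rows come out clean as well as the header names.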

Answer 2

Answer by jezrael

Another solution for removing whitespace from column names is str.strip:

jobNames.columns = jobNames.columns.str.strip()
print(jobNames.job_name)

0     Foo
1     Moo
2     Goo
3    Flue
4     Tru
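
For a version that runs on its own, here is a small sketch of the same `str.strip` idea. The inline `csv_text` is a hypothetical header with spaces on both sides of the commas, chosen to show that the names are stripped at both ends; it is not the asker's actual file.

```python
import io

import pandas as pd

# Hypothetical CSV whose header has spaces before and after the commas.
csv_text = (
    "job_id , job_name , num_judgements\n"
    "933985, Foo, 180\n"
)
job_names = pd.read_csv(io.StringIO(csv_text))

# Strip leading and trailing whitespace from every column name.
job_names.columns = job_names.columns.str.strip()

print(list(job_names.columns))
can_use_attribute = hasattr(job_names, "job_name")
print(can_use_attribute)
```

This only cleans the column names; any spaces inside the cell values are left as they were read.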

Answer 3

Answer by drevicko

Another (perhaps inferior) approach is to remove the spaces from the column names:

> jobNames.columns = [s.strip() for s in jobNames.columns]
> jobNames.job_name

0   Foo
1   Moo
2   Goo
3   Flue
4   Tru
Name: job_name, dtype: object
