pandas: can't access dataframe columns
Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/38894098/
Can't access dataframe columns
Asked by drevicko
I'm importing a dataframe from a csv file, but cannot access some of its columns by name. What's going on?
In more concrete terms:
> import pandas
> jobNames = pandas.read_csv("job_names.csv")
> print(jobNames)
   job_id  job_name  num_judgements
0  933985       Foo             180
1  933130       Moo             175
2  933123       Goo             150
3  933094      Flue             120
4  933088       Tru             120
When I try to access the second column, I get an error:
> jobNames.job_name
AttributeError: 'DataFrame' object has no attribute 'job_name'
Strangely, I can access the job_id column thus:
> print(jobNames.job_id)
0 933985
1 933130
2 933123
3 933094
4 933088
Name: job_id, dtype: int64
Edit (to put the accepted answer in context):
It turns out that the first row of the csv file (with the column names) looks like this:
job_id, job_name, num_judgements
Note the spaces after each comma! Those spaces are retained in the column names:
> jobNames.columns[1]
' job_name'
Names like these don't form valid Python identifiers, so those columns aren't available as dataframe attributes. I can still access them dict-style:
> jobNames[' job_name']
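A minimal sketch reproducing the behaviour described in the edit, assuming a small in-memory CSV in place of job_names.csv. The names csv_text and job_names are illustrative and do not come from the original post:

```python
import io

import pandas as pd

# Hypothetical stand-in for job_names.csv: note the space after each comma.
csv_text = (
    "job_id, job_name, num_judgements\n"
    "933985, Foo, 180\n"
    "933130, Moo, 175\n"
)

job_names = pd.read_csv(io.StringIO(csv_text))

# The leading spaces survive in the parsed column names.
print(list(job_names.columns))

# Attribute access looks up the name 'job_name', which is not a column.
print(hasattr(job_names, "job_name"))

# Bracket access works as long as the exact name, space included, is used.
print(job_names[" job_name"].tolist())
```

Printing the column list is usually the quickest way to spot stray whitespace of this kind.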
Accepted answer by Maxim Egorushkin
When using pandas.read_csv, pass in the skipinitialspace=True flag to remove whitespace after CSV delimiters.
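As a rough illustration of the accepted answer, the sketch below reads a small in-memory CSV with skipinitialspace=True. The sample data and the names csv_text and job_names are assumptions made for the example, not part of the original post:

```python
import io

import pandas as pd

# Hypothetical sample data with a space after every comma.
csv_text = (
    "job_id, job_name, num_judgements\n"
    "933985, Foo, 180\n"
    "933130, Moo, 175\n"
)

# skipinitialspace=True drops whitespace that follows each delimiter,
# in the header row and in the data rows alike.
job_names = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

print(list(job_names.columns))
print(job_names.job_name.tolist())
```

With the flag set, the header names come out clean, so attribute access such as job_names.job_name should work.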
Answer by jezrael
Answer by drevicko
Another (perhaps inferior) approach is to remove the spaces from the column names:
> jobNames.columns = jobNames.columns.str.strip()
> jobNames.job_name
0 Foo
1 Moo
2 Goo
3 Flue
4 Tru
Name: job_name, dtype: object
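One caveat with renaming the columns is that it only touches the header, not the cell values. The sketch below, which assumes a small hand-built frame (the names raw and cleaned are illustrative), uses DataFrame.rename with str.strip as one possible way to do the same clean-up without assigning to .columns:

```python
import pandas as pd

# Hypothetical frame mimicking what read_csv returns without skipinitialspace:
# both the column name and the string values carry a leading space.
raw = pd.DataFrame({"job_id": [933985, 933130], " job_name": [" Foo", " Moo"]})

# rename applies the callable to every column label and returns a new frame.
cleaned = raw.rename(columns=str.strip)

print(list(cleaned.columns))

# The string values still carry their leading space; strip them separately.
cleaned["job_name"] = cleaned["job_name"].str.strip()
print(cleaned.job_name.tolist())
```

This is why the original poster calls the approach "perhaps inferior": stripping at read time with skipinitialspace cleans the values as well.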