Pandas column access with column names containing spaces
Original question: http://stackoverflow.com/questions/13757090/
License: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by Brad Fair
If I import or create a pandas column whose name contains no spaces, I can access it like this:
from pandas import DataFrame

df1 = DataFrame({'key': ['b', 'b', 'a', 'c', 'a', 'a', 'b'],
                 'data1': range(7)})
df1.data1  # attribute access works because 'data1' is a valid identifier
which returns that Series. If, however, the column has a space in its name, it isn't accessible that way:
df2 = DataFrame({'key': ['a', 'b', 'd'],
                 'data 2': range(3)})
df2.data 2  # <--- not the droid I'm looking for.
I know I can access it using .xs():
df2.xs('data 2', axis=1)
There's got to be another way. I've googled it like mad and can't think of any other way to phrase the search. I've read all 96 entries here on SO that contain "column", "string" and "pandas" and could find no previous answer. Is this the only way, or is there something better?
Thanks!
Accepted answer by Rutger Kassies
I think the default way is to use:
import pandas

df1 = pandas.DataFrame({'key': ['b', 'b', 'a', 'c', 'a', 'a', 'b'],
                        'dat a1': range(7)})
df1['dat a1']  # bracket indexing accepts any column label
The other methods, like exposing the column as an attribute, are more for convenience.
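As a rough illustration of the point above, the sketch below reuses the question's df2 and compares bracket indexing with label-based selection through .loc. The variable names by_brackets, by_loc and same are made up for this example.

```python
import pandas as pd

# Same shape of data as df2 in the question.
df2 = pd.DataFrame({'key': ['a', 'b', 'd'], 'data 2': range(3)})

# Bracket indexing works for any column label, including ones with spaces.
by_brackets = df2['data 2']

# .loc with a full row slice selects the same column by label.
by_loc = df2.loc[:, 'data 2']

# Check that both approaches yield the same Series.
same = by_brackets.equals(by_loc)
```

Either form avoids the attribute syntax, which only works when the label happens to be a valid Python identifier.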
Answer by AkiRoss
Old post, but may be interesting: an idea (which is destructive, but does the job if you want it quick and dirty) is to rename columns using underscores:
df1.columns = [c.replace(' ', '_') for c in df1.columns]
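If overwriting the labels in place feels too destructive, one possible variation (not part of the original answer) is DataFrame.rename, which returns a new frame and leaves the original untouched. The small df1 and the names renamed and values below are assumptions for illustration.

```python
import pandas as pd

# Small stand-in for df1 from the accepted answer.
df1 = pd.DataFrame({'key': ['b', 'a'], 'dat a1': [0, 1]})

# rename returns a new DataFrame; df1 keeps its original labels.
renamed = df1.rename(columns=lambda c: c.replace(' ', '_'))

# The underscore version is a valid identifier, so attribute access works.
values = renamed.dat_a1.tolist()
```

The trade-off is that the renamed labels no longer match the source data, so anything written back out may need the original names restored.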
Answer by Olsgaard
While the accepted answer works for specifying columns with dictionaries or []-selection, it does not generalise to other situations where one needs to refer to columns, such as the assign method:
> df.assign("data 2" = lambda x: x.sum(axis=1))
SyntaxError: keyword can't be an expression
Answer by Abuw
If you want to pass column names containing spaces to a pandas method such as assign, you can supply the arguments as an unpacked dictionary.
df.assign(**{'space column': (lambda x: x['space column2'])})
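A minimal runnable sketch of the same idea, assuming a small made-up frame with numeric columns 'a' and 'b'; the new column name 'data 2' is borrowed from the question.

```python
import pandas as pd

# Hypothetical input data for illustration.
df = pd.DataFrame({'a': [1, 2], 'b': [10, 20]})

# Keyword names must be valid identifiers, so a label with a space
# is passed through an unpacked dictionary instead.
result = df.assign(**{'data 2': lambda x: x.sum(axis=1)})

# The new column holds the row-wise sums of 'a' and 'b'.
row_sums = result['data 2'].tolist()
```

Because assign returns a new DataFrame, df itself is left without the 'data 2' column.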
Answer by Jochen Gebsattel
If you want to apply filtering, that also works with column names that contain spaces, e.g. filtering on NULL values or empty strings:
# Keep rows whose value is neither null nor an empty string.
df_package[(df_package['Country_Region Code'].notnull()) &
           (df_package['Country_Region Code'] != '')]
I figured this out thanks to Rutger Kassies' answer.
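To make the filtering idea concrete, here is a small sketch with made-up data for df_package. It assumes the goal is to drop rows that are null or empty, so the two conditions are combined with & (both must hold); combining them with | would keep every row, since a null value still compares as not equal to ''.

```python
import pandas as pd

# Made-up sample data; only the column name comes from the answer.
df_package = pd.DataFrame({'Country_Region Code': ['DE', '', None, 'US']})

col = df_package['Country_Region Code']

# Boolean mask: value is present AND is not an empty string.
mask = col.notnull() & (col != '')

kept = df_package[mask]
kept_codes = kept['Country_Region Code'].tolist()
```

Storing the column in a local variable such as col keeps the mask readable when the label is long.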