Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/13757090/

Date: 2020-09-09 00:07:55  Source: igfitidea

Pandas column access with column names containing spaces

string, pandas

Asked by Brad Fair

If I import or create a pandas column whose name contains no spaces, I can access it like this:

from pandas import DataFrame

df1 = DataFrame({'key': ['b', 'b', 'a', 'c', 'a', 'a', 'b'],
                 'data1': range(7)})

df1.data1  # attribute access returns the 'data1' Series

which would return that series for me. If, however, that column has a space in its name, it isn't accessible via that method:

df2 = DataFrame({'key': ['a','b','d'],
                 'data 2': range(3)})

df2.data 2      # <--- not the droid i'm looking for.

I know I can access it using .xs():

df2.xs('data 2', axis=1)

There's got to be another way. I've googled it like mad and can't think of any other way to google it. I've read all 96 entries here on SO that contain "column" and "string" and "pandas" and could find no previous answer. Is this the only way, or is there something better?

Thanks!

Accepted answer by Rutger Kassies

I think the default way is to use:

import pandas

df1 = pandas.DataFrame({'key': ['b', 'b', 'a', 'c', 'a', 'a', 'b'],
                        'dat a1': range(7)})

df1['dat a1']

The other methods, like exposing it as an attribute, are more for convenience.
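As a minimal sketch of the difference, the snippet below reuses the question's df2. The `.loc` line is an addition that is not part of the original answer; it is just another label-based way to reach the same column.

```python
import pandas as pd

# Same frame as df2 in the question.
df2 = pd.DataFrame({'key': ['a', 'b', 'd'],
                    'data 2': range(3)})

# Bracket selection works for any column label, spaces included.
by_brackets = df2['data 2']

# .loc with a full row slice selects the same column by label.
by_loc = df2.loc[:, 'data 2']

# Attribute access only works for labels that are valid Python identifiers.
by_attribute = df2.key
```

Bracket selection is the form that keeps working regardless of what characters the label contains.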

Answer by AkiRoss

Old post, but may be interesting: an idea (which is destructive, but does the job if you want it quick and dirty) is to rename columns using underscores:

df1.columns = [c.replace(' ', '_') for c in df1.columns]
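If overwriting the labels in place is too destructive, a non-mutating variant might look like the sketch below. It assumes `DataFrame.rename` with a callable for `columns`, which returns a new frame; the shortened data and the names `renamed` and `values` are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'key': ['b', 'b', 'a'],
                    'dat a1': range(3)})

# rename() returns a new DataFrame; df1 keeps its original labels.
renamed = df1.rename(columns=lambda c: c.replace(' ', '_'))

# The renamed column is now reachable as an attribute.
values = renamed.dat_a1
```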

Answer by Olsgaard

While the accepted answer works for column specification when using dictionaries or []-selection, it does not generalise to other situations where one needs to refer to columns, such as the assign method:

> df.assign("data 2" = lambda x: x.sum(axis=1))
SyntaxError: keyword can't be an expression

Answer by Abuw

If you want to pass column names containing spaces to a pandas method like assign, you can put your inputs in a dictionary and unpack it:

df.assign(**{'space column': (lambda x: x['space column2'])})
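A runnable sketch of this trick, using a made-up frame (the labels 'unit price', 'quantity' and 'total cost' are hypothetical and not from the original answer):

```python
import pandas as pd

df = pd.DataFrame({'unit price': [2.0, 3.5], 'quantity': [4, 2]})

# Keyword arguments must be valid identifiers, so pass the spaced name
# as a dictionary key and unpack the dictionary with **.
result = df.assign(**{'total cost': lambda x: x['unit price'] * x['quantity']})
```

Since assign returns a new frame, df itself is left without the 'total cost' column.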

Answer by Jochen Gebsattel

If you want to apply filtering, that's also possible with column names that contain spaces, e.g. filtering on NULL values or empty strings:

# Keep rows whose code is neither NULL nor an empty string.
df_package[(df_package['Country_Region Code'].notnull()) &
           (df_package['Country_Region Code'] != u'')]

as I figured out thanks to Rutger Kassies' answer.
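For a self-contained illustration, here is a sketch with invented data (the 'Package Count' column and all values are hypothetical). It keeps only rows whose code is neither missing nor empty by combining both conditions with `&`, and it also shows that `query` accepts backtick-quoted names, assuming pandas 0.25 or newer.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing code and one empty string.
df_package = pd.DataFrame({
    'Country_Region Code': ['DE', np.nan, '', 'US'],
    'Package Count': [1, 2, 3, 4],
})

col = df_package['Country_Region Code']

# Keep rows whose code is neither missing nor an empty string.
non_empty = df_package[col.notnull() & (col != '')]

# query() accepts backtick-quoted names for labels with spaces
# (assumes pandas 0.25 or newer).
large = df_package.query("`Package Count` > 2")
```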
