Pandas: how to apply function to column names
Original question: http://stackoverflow.com/questions/42788311/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Denis Kulagin
I would like all columns to be named in a uniform manner, like:
Last Name -> LAST_NAME
e-mail -> E_MAIL
ZIP code 2 -> ZIP_CODE_2
For that purpose I wrote a function that uppercases all letters, keeps digits, and replaces every other character with an underscore ('_'). It then collapses runs of underscores into a single one and trims underscores at both ends.
How do I apply this function (lambda) to the column names in Pandas?
Answered by EdChum
You can do this without using apply by calling the vectorised str methods:
In [62]:
df = pd.DataFrame(columns=['Last Name','e-mail','ZIP code 2'])
df.columns
Out[62]:
Index(['Last Name', 'e-mail', 'ZIP code 2'], dtype='object')
In [63]:
df.columns = df.columns.str.upper().str.replace(' ','_')
df.columns
Out[63]:
Index(['LAST_NAME', 'E-MAIL', 'ZIP_CODE_2'], dtype='object')
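Note that the output above still contains a hyphen in E-MAIL, because only spaces are replaced. The following is a minimal sketch of how the same vectorised approach could cover the full rule from the question (every character other than a letter or digit becomes an underscore, runs are collapsed, and underscores are trimmed at both ends). It assumes a pandas version in which str.replace accepts regex=True, and it assumes the names contain only ASCII letters; the regular expression is my own choice, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(columns=['Last Name', 'e-mail', 'ZIP code 2'])

# Uppercase first, then replace each run of characters that are not
# A-Z or 0-9 with a single underscore, then trim underscores at the ends.
# Assumption: ASCII-only names; accented letters would also become '_'.
df.columns = (
    df.columns.str.upper()
    .str.replace(r'[^A-Z0-9]+', '_', regex=True)
    .str.strip('_')
)

print(list(df.columns))
```

Because the pattern matches a whole run of unwanted characters at once, no separate step is needed to collapse repeated underscores.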
Otherwise you can convert the Index object to a Series using to_series so you can use apply:
In [67]:
def func(x):
    return x.upper().replace(' ','_')
df.columns = df.columns.to_series().apply(func)
df
Out[67]:
Empty DataFrame
Columns: [LAST_NAME, E-MAIL, ZIP_CODE_2]
Index: []
Thanks to @PaulH for suggesting using rename with a lambda:
In [68]:
df.rename(columns=lambda c: c.upper().replace(' ','_'), inplace=True)
df.columns
Out[68]:
Index(['LAST_NAME', 'E-MAIL', 'ZIP_CODE_2'], dtype='object')
Answered by Willem Van Onsem
You can simply set the .columns property of the data frame. So in order to rename the columns, you can use:
df.columns = list(map(yourlambda,df.columns))
Here, replace yourlambda with your own lambda expression.
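As an illustration of the map approach, here is a sketch with a concrete function in place of yourlambda. The helper name normalise, its regular expression, and the extra ' (phone) ' column are hypothetical and only serve the example; the rule itself follows the description in the question and assumes ASCII-only names.

```python
import re

import pandas as pd


def normalise(name):
    """Hypothetical helper: uppercase, keep A-Z and 0-9, underscore the rest."""
    # Each run of other characters becomes a single underscore.
    name = re.sub(r'[^A-Z0-9]+', '_', name.upper())
    # Trim underscores left over at either end.
    return name.strip('_')


# ' (phone) ' is an invented column name that exercises the trimming step.
df = pd.DataFrame(columns=['Last Name', 'e-mail', 'ZIP code 2', ' (phone) '])

df.columns = list(map(normalise, df.columns))

print(list(df.columns))
```

The same function can also be passed to rename, as in df.rename(columns=normalise), which returns a new data frame instead of assigning to .columns.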