Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/42788311/

Pandas: how to apply function to column names

Tags: python, pandas

Asked by Denis Kulagin

I would like all columns to be named in a uniform manner, like:

Last Name -> LAST_NAME
e-mail -> E_MAIL
ZIP code 2 -> ZIP_CODE_2

For that purpose I wrote a function that uppercases all letters, keeps digits, and replaces every other character with an underscore ('_'). It then collapses multiple underscores into one and trims underscores from both ends.

How do I apply this function (lambda) to the column names in Pandas?
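
The function itself is not shown in the question. A minimal sketch that matches the description might look like the following; the name normalize_name is hypothetical, and the sketch assumes that only ASCII letters and digits should be kept (accented letters would become underscores):

```python
import re

def normalize_name(name):
    """Uppercase, replace other characters with '_', collapse and trim underscores."""
    # Uppercase first so that only A-Z and 0-9 need to be kept.
    upper = name.upper()
    # Each run of any other characters becomes a single underscore.
    collapsed = re.sub(r'[^A-Z0-9]+', '_', upper)
    # Drop underscores left over at either end.
    return collapsed.strip('_')

for raw in ['Last Name', 'e-mail', 'ZIP code 2']:
    print(raw, '->', normalize_name(raw))
```

Replacing whole runs in one pass means the separate "collapse multiple underscores" step is not needed.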

Answered by EdChum

You can do this without using apply by calling the vectorised str methods:

In [62]:
import pandas as pd
df = pd.DataFrame(columns=['Last Name','e-mail','ZIP code 2'])
df.columns

Out[62]:
Index(['Last Name', 'e-mail', 'ZIP code 2'], dtype='object')

In [63]:    
df.columns = df.columns.str.upper().str.replace(' ','_')
df.columns    

Out[63]:
Index(['LAST_NAME', 'E-MAIL', 'ZIP_CODE_2'], dtype='object')
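
Note that E-MAIL keeps its hyphen here, because only spaces are replaced. To get the naming asked for in the question, the same vectorised approach can take a regular expression instead. This is a sketch, assuming a pandas version in which str.replace accepts the regex keyword and that only ASCII letters and digits should survive:

```python
import pandas as pd

df = pd.DataFrame(columns=['Last Name', 'e-mail', 'ZIP code 2'])

df.columns = (
    df.columns.str.upper()
    # Replace each run of characters other than A-Z and 0-9 with one underscore.
    .str.replace(r'[^A-Z0-9]+', '_', regex=True)
    # Trim underscores left at the start or end of a name.
    .str.strip('_')
)
print(list(df.columns))
```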

Otherwise you can convert the Index object to a Series using to_series so you can use apply:

In [67]:
def func(x):
    return x.upper().replace(' ','_')
df.columns = df.columns.to_series().apply(func)
df

Out[67]:
Empty DataFrame
Columns: [LAST_NAME, E-MAIL, ZIP_CODE_2]
Index: []
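
The Index also has a map method of its own, so the round trip through a Series can probably be skipped. A small sketch, reusing the same func as above:

```python
import pandas as pd

def func(x):
    # Same transformation as in the answer: uppercase, spaces to underscores.
    return x.upper().replace(' ', '_')

df = pd.DataFrame(columns=['Last Name', 'e-mail', 'ZIP code 2'])
# Index.map applies func to every label and returns a new Index.
df.columns = df.columns.map(func)
print(list(df.columns))
```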

Thanks to @PaulH for suggesting using rename with a lambda:

In [68]:
df.rename(columns=lambda c: c.upper().replace(' ','_'), inplace=True)
df.columns

Out[68]:
Index(['LAST_NAME', 'E-MAIL', 'ZIP_CODE_2'], dtype='object')
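
rename accepts any callable, not only a lambda, and without inplace=True it returns a new DataFrame and leaves the original untouched. A short sketch, using the built-in str.upper purely as an illustrative callable:

```python
import pandas as pd

df = pd.DataFrame(columns=['Last Name', 'e-mail', 'ZIP code 2'])

# No inplace=True: the result is a new DataFrame with renamed columns.
renamed = df.rename(columns=str.upper)

print(list(renamed.columns))
print(list(df.columns))  # the original column names are unchanged
```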

Answered by Willem Van Onsem

You can simply set the .columns property of the data frame. So in order to rename the columns, you can use:

df.columns = list(map(yourlambda,df.columns))

Here, of course, you replace yourlambda with your own lambda expression.
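
As a concrete illustration, here is a sketch with one possible lambda (uppercase and spaces to underscores, chosen only as an example). A list comprehension gives the same result and may read more naturally:

```python
import pandas as pd

df = pd.DataFrame(columns=['Last Name', 'e-mail', 'ZIP code 2'])

# Example lambda; substitute whatever transformation you need.
yourlambda = lambda c: c.upper().replace(' ', '_')

# Both forms build a plain list of new names from the current columns.
via_map = list(map(yourlambda, df.columns))
via_comprehension = [yourlambda(c) for c in df.columns]

df.columns = via_map
print(list(df.columns))
```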
