Renaming columns in a Pandas dataframe using a regular expression

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/26500156/

Time: 2020-09-13 22:36:07  Source: igfitidea

python regex pandas

Asked by lokheart

I have a Pandas dataframe and I want to remove the trailing space from the end of each column name. I tried something like:

raw_data.columns.values = re.sub(' $','',raw_data.columns.values)

But this does not work. What did I do wrong here?

Answered by lokheart

I should have used the re package together with rename, which applies the function to each column name separately:

raw_data = raw_data.rename(columns=lambda x: re.sub(' $','',x))
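As a rough, self-contained sketch of the line above (the sample dataframe and its column names are made up for illustration and are not from the original question):

```python
import re

import pandas as pd

# Hypothetical sample data: each column name ends with a single space.
raw_data = pd.DataFrame({"name ": ["a", "b"], "value ": [1, 2]})

# rename calls the function once per column label, so re.sub always
# receives a single string rather than the whole array of names.
raw_data = raw_data.rename(columns=lambda x: re.sub(r" $", "", x))

print(list(raw_data.columns))
```

Note that the pattern ' $' removes only one trailing space; something like r'\s+$' would presumably be needed if a name can end with several whitespace characters.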

Answered by Christian

I would recommend using pandas.Series.str.strip, which is also available on the column index through the .str accessor. It removes whitespace from both ends of each name:

df.columns = df.columns.str.strip()
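A small sketch of how this behaves, assuming a made-up dataframe whose column names carry stray whitespace. It also shows .str.rstrip(), which might be closer to the question if only the end of each name should be trimmed:

```python
import pandas as pd

# Hypothetical column names with whitespace on one or both sides.
df = pd.DataFrame({" id ": [1], "price ": [2.5]})

# .str.strip() removes leading and trailing whitespace from every label.
stripped = df.columns.str.strip()

# .str.rstrip() only trims the end and leaves leading whitespace alone.
right_stripped = df.columns.str.rstrip()

# Assigning back to df.columns replaces the labels on the dataframe itself.
df.columns = stripped
print(list(df.columns))
```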

Answered by José

The answer from @Christian is probably right for this specific question, but for the more general question of replacing parts of column names, I would suggest building a dictionary comprehension and passing it to the rename function:

df.rename(columns={element: re.sub(r' $', r'', element) for element in df.columns.tolist()})

In my case, I wanted to add something to the beginning of each column name, so:

df.rename(columns={element: re.sub(r'(.+)', r'x_\1', element) for element in df.columns.tolist()})
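A minimal sketch of the prefixing idea, assuming a toy dataframe with columns a and b (hypothetical names). The replacement string uses the backreference \1 so that the captured original name is kept after the x_ prefix:

```python
import re

import pandas as pd

# Hypothetical dataframe with two short column names.
df = pd.DataFrame({"a": [1], "b": [2]})

# Build an old-name -> new-name mapping; \1 re-inserts the captured name.
mapping = {col: re.sub(r"(.+)", r"x_\1", col) for col in df.columns}

# rename returns a new dataframe and leaves df untouched.
renamed = df.rename(columns=mapping)

print(mapping)
print(list(renamed.columns))
```

Building the mapping first makes it easy to inspect the old and new names before applying them.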

Note that rename returns a new dataframe and leaves df unchanged. You can use the inplace=True parameter, or assign the result back to df, to actually make the change in the dataframe.
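To illustrate the difference, here is a short sketch on a made-up one-column dataframe. Without inplace=True the original keeps its old names, while with it the call modifies df and returns None:

```python
import pandas as pd

# Hypothetical dataframe whose only column name ends with a space.
df = pd.DataFrame({"a ": [1]})

# Default behaviour: a renamed copy is returned, df is unchanged.
result = df.rename(columns={"a ": "a"})
before = list(df.columns)

# With inplace=True the dataframe itself is modified and None is returned.
ret = df.rename(columns={"a ": "a"}, inplace=True)
after = list(df.columns)

print(before, after, ret)
```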