Remove prefix (or suffix) substring from column headers in pandas

Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/55679401/

Retrieved: 2020-09-14 06:22:43. Source: igfitidea

Tags: python, pandas

Asked by user9185511

I'm trying to remove the substring _x from the end of some of my df column names.

Sample df code:

import pandas as pd

d = {'W_x': ['abcde','abcde','abcde']}
df = pd.DataFrame(data=d)

df['First_x']=[0,0,0]
df['Last_x']=[1,2,3]
df['Slice']=['abFC=0.01#%sdadf','12fdak*4%FC=-0.035faf,dd43','FC=0.5fasff']

Output:

     W_x  First_x  Last_x                       Slice
0  abcde        0       1            abFC=0.01#%sdadf
1  abcde        0       2  12fdak*4%FC=-0.035faf,dd43
2  abcde        0       3                 FC=0.5fasff

Desired output:

       W  First  Last                       Slice
0  abcde      0     1            abFC=0.01#%sdadf
1  abcde      0     2  12fdak*4%FC=-0.035faf,dd43
2  abcde      0     3                 FC=0.5fasff

Accepted answer by dmontaner

I usually use @cs95's approach, but wrap it in a data frame method for convenience:

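import re
import pandas as pd

def drop_prefix(self, prefix):
    # Remove the literal prefix only where a column name starts with it.
    # str.lstrip(prefix) would instead strip any leading characters contained in prefix.
    self.columns = self.columns.str.replace('^' + re.escape(prefix), '', regex=True)
    return self

# Attach the function as a DataFrame method (monkey patching)
pd.DataFrame.drop_prefix = drop_prefix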

Then you can use it just like its inverse, add_prefix, which is already implemented in pandas:

df.drop_prefix('myprefix_')
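
For readers who would rather not patch the DataFrame class, here is a minimal sketch of the same idea as a plain function applied with DataFrame.pipe. The column names price and qty are made up for illustration, and this variant is an assumption of this write-up rather than part of the original answer. It also shows why a character-based lstrip is risky for prefixes.

```python
import pandas as pd

def drop_prefix(df, prefix):
    """Return a new frame with `prefix` removed from column names that start with it."""
    def strip_one(name):
        # Only touch string labels that really begin with the prefix
        if isinstance(name, str) and name.startswith(prefix):
            return name[len(prefix):]
        return name
    return df.rename(columns=strip_one)

# Hypothetical data, used only to demonstrate the round trip
df = pd.DataFrame({'price': [1.5, 2.0], 'qty': [3, 4]})
prefixed = df.add_prefix('myprefix_')
restored = prefixed.pipe(drop_prefix, 'myprefix_')

print(list(prefixed.columns))  # ['myprefix_price', 'myprefix_qty']
print(list(restored.columns))  # ['price', 'qty']

# lstrip treats its argument as a set of characters, so it eats into the name
print('myprefix_price'.lstrip('myprefix_'))  # 'ce'
```

Because the helper goes through rename, it returns a new frame and leaves the input untouched.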

Answer by cs95

Use str.strip/rstrip:

# df.columns = df.columns.str.strip('_x')
# Or, 
df.columns = df.columns.str.rstrip('_x')  # strip suffix at the right end only.

df.columns
# Index(['W', 'First', 'Last', 'Slice'], dtype='object')


To avoid the issue highlighted in the comments:

Beware of strip() if any column name starts or ends with either _ or x beyond the suffix.

strip() and rstrip() treat their argument as a set of characters, not as a substring. You could use str.replace with a regular expression anchored at the end of the name instead:

# regex=True is needed in pandas 2.0+, where str.replace defaults to regex=False
df.columns = df.columns.str.replace(r'_x$', '', regex=True)

df.columns
# Index(['W', 'First', 'Last', 'Slice'], dtype='object')
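
The difference between the two approaches is easiest to see on names that contain extra _ or x characters. The following is a small sketch; the column names Box_x and x_max are hypothetical and chosen only to trigger the pitfall.

```python
import pandas as pd

# Hypothetical column names chosen to expose the pitfall
cols = pd.Index(['W_x', 'Box_x', 'x_max', 'Slice'])

# rstrip removes any trailing characters that appear in the set {'_', 'x'}
stripped = cols.str.rstrip('_x')

# An anchored regular expression removes only a literal trailing '_x'
replaced = cols.str.replace(r'_x$', '', regex=True)

print(list(stripped))  # ['W', 'Bo', 'x_ma', 'Slice']
print(list(replaced))  # ['W', 'Box', 'x_max', 'Slice']
```

Here rstrip damages Box_x and x_max, while the anchored pattern leaves everything except the real suffix alone.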

Answer by Quang Hoang

# The conditional expression must come before the for clause
df.columns = [col[:-2] if col[-2:]=='_x' else col for col in df.columns]

or

# Note: this removes every '_x' in a name, not only a trailing one
df.columns = [col.replace('_x', '') for col in df.columns]
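
The two variants above agree on the question's data but can differ on other names. Below is a short sketch in plain Python; the names x_x_x and my_xval are invented for illustration.

```python
# Hypothetical column names; the last three are not affected the same way by both variants
cols = ['W_x', 'First_x', 'x_x_x', 'my_xval', 'Slice']

# Variant 1: drop the last two characters only when the name ends with '_x'
suffix_only = [c[:-2] if c.endswith('_x') else c for c in cols]

# Variant 2: str.replace removes every occurrence of '_x', wherever it appears
anywhere = [c.replace('_x', '') for c in cols]

print(suffix_only)  # ['W', 'First', 'x_x', 'my_xval', 'Slice']
print(anywhere)     # ['W', 'First', 'x', 'myval', 'Slice']
```

On Python 3.9 or later, c.removesuffix('_x') behaves like the first variant.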

Answer by Quickbeam2k1

I'd suggest using the rename function:

df.rename(columns = lambda x: x.strip('_x'))

The output is as desired.

Of course, you can also take care of FabienP's comment and modify it according to Quang Hoang's solution:

import re

# Python's str.replace does not interpret regular expressions, so use re.sub for the anchored pattern
df.rename(columns = lambda x: re.sub(r'_x$', '', x))

This gives the desired output.

Another solution is simply:

df.rename(columns = lambda x: x[:-2] if x.endswith('_x') else x)
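
As an end-to-end check, here is a sketch that rebuilds the question's frame and applies the endswith-based rename. Note that rename returns a new DataFrame, so the result has to be assigned; the variable name renamed is just a choice made for this example.

```python
import pandas as pd

# Rebuild the sample frame from the question
df = pd.DataFrame({'W_x': ['abcde', 'abcde', 'abcde']})
df['First_x'] = [0, 0, 0]
df['Last_x'] = [1, 2, 3]
df['Slice'] = ['abFC=0.01#%sdadf', '12fdak*4%FC=-0.035faf,dd43', 'FC=0.5fasff']

# rename returns a new frame; df itself is left unchanged
renamed = df.rename(columns=lambda x: x[:-2] if x.endswith('_x') else x)

print(list(renamed.columns))  # ['W', 'First', 'Last', 'Slice']
print(list(df.columns))       # ['W_x', 'First_x', 'Last_x', 'Slice']
```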