Remove prefix (or suffix) substring from column headers in pandas
Source: http://stackoverflow.com/questions/55679401/
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute the content to the original authors on Stack Overflow.
Asked by user9185511
I'm trying to remove the substring _x that appears at the end of some of my df column names.
Sample df code:
import pandas as pd
d = {'W_x': ['abcde','abcde','abcde']}
df = pd.DataFrame(data=d)
df['First_x']=[0,0,0]
df['Last_x']=[1,2,3]
df['Slice']=['abFC=0.01#%sdadf','12fdak*4%FC=-0.035faf,dd43','FC=0.5fasff']
Output:
     W_x  First_x  Last_x                       Slice
0  abcde        0       1            abFC=0.01#%sdadf
1  abcde        0       2  12fdak*4%FC=-0.035faf,dd43
2  abcde        0       3                 FC=0.5fasff
Desired output:
       W  First  Last                       Slice
0  abcde      0     1            abFC=0.01#%sdadf
1  abcde      0     2  12fdak*4%FC=-0.035faf,dd43
2  abcde      0     3                 FC=0.5fasff
Accepted answer by dmontaner
I usually use @cs95's approach, but wrap it in a DataFrame method for convenience:
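import pandas as pd
def drop_prefix(self, prefix):
    # str.lstrip(prefix) would strip any leading characters found in `prefix`,
    # not the prefix as a whole, so match the full substring instead.
    self.columns = [c[len(prefix):] if c.startswith(prefix) else c
                    for c in self.columns]
    return self
# Attach the helper to DataFrame so it can be called as a method.
pd.core.frame.DataFrame.drop_prefix = drop_prefix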
Then you can use it just like its inverse, add_prefix, which pandas already implements:
df.drop_prefix('myprefix_')
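Because lstrip and rstrip treat their argument as a set of characters, a prefix-dropping helper is safer when it checks for the whole substring. The following is a minimal sketch of that idea without patching DataFrame; the function names without_prefix and without_suffix and the sample names are hypothetical, chosen only for illustration.

```python
def without_prefix(names, prefix):
    """Return the names with a leading `prefix` removed where present."""
    return [n[len(prefix):] if n.startswith(prefix) else n for n in names]


def without_suffix(names, suffix):
    """Return the names with a trailing `suffix` removed where present."""
    # Guard against an empty suffix: n[:-0] would yield an empty string.
    return [n[:-len(suffix)] if suffix and n.endswith(suffix) else n
            for n in names]


cols = ['myprefix_my', 'myprefix_x', 'other']

# Whole-substring matching keeps the rest of each name intact.
dropped = without_prefix(cols, 'myprefix_')

# lstrip removes every leading character that occurs in 'myprefix_',
# which empties names made up only of those characters.
lstripped = [c.lstrip('myprefix_') for c in cols]

print(dropped)
print(lstripped)
```

Both helpers accept any iterable of strings, so a result such as without_suffix(df.columns, '_x') could be assigned back to df.columns.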
Answer by cs95
Use str.strip/rstrip:
# df.columns = df.columns.str.strip('_x')
# Or,
df.columns = df.columns.str.rstrip('_x') # strip suffix at the right end only.
df.columns
# Index(['W', 'First', 'Last', 'Slice'], dtype='object')
The comments highlight an issue with this approach:
Beware of strip() if any column name starts or ends with either _ or x beyond the suffix.
This happens because strip() and rstrip() treat their argument as a set of characters rather than as a substring. To avoid it, you could use str.replace with an anchored regular expression:
df.columns = df.columns.str.replace(r'_x$', '', regex=True)  # regex=True is needed in pandas 2.0+, where the default is a literal match
df.columns
# Index(['W', 'First', 'Last', 'Slice'], dtype='object')
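To make the difference concrete, here is a small sketch in plain Python; the string methods behave the same way through the pandas .str accessor. The names box_x and x_max are hypothetical additions, picked because they contain extra _ and x characters.

```python
import re

names = ['W_x', 'First_x', 'box_x', 'x_max', 'Slice']

# rstrip removes any trailing characters that belong to the set {'_', 'x'},
# so it can eat into the part of the name that should be kept.
stripped = [n.rstrip('_x') for n in names]

# An anchored pattern removes '_x' only when it is the end of the name.
anchored = [re.sub(r'_x$', '', n) for n in names]

print(stripped)   # ['W', 'First', 'bo', 'x_ma', 'Slice']
print(anchored)   # ['W', 'First', 'box', 'x_max', 'Slice']
```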
Answer by Quang Hoang
df.columns = [col[:-2] if col[-2:]=='_x' else col for col in df.columns]
or
df.columns = [col.replace('_x', '') for col in df.columns]
Answer by Quickbeam2k1
I'd suggest using the rename function:
df.rename(columns = lambda x: x.strip('_x'))
The output is as desired.
Of course, you can also address FabienP's comment and modify it according to Quang Hoang's solution:
import re
# Python's str.replace would match '_x$' literally, so use re.sub for the anchored pattern.
df.rename(columns = lambda x: re.sub(r'_x$', '', x))
This gives the desired output.
Another solution is simply:
df.rename(columns = lambda x: x[:-2] if x.endswith('_x') else x)
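One detail worth noting about the rename approach: by default rename returns a new DataFrame and leaves the original untouched, so the result has to be assigned to a name. A minimal sketch, assuming a simplified version of the question's data (the Slice values are shortened here) and a hypothetical variable name renamed:

```python
import pandas as pd

df = pd.DataFrame({
    'W_x': ['abcde', 'abcde', 'abcde'],
    'First_x': [0, 0, 0],
    'Last_x': [1, 2, 3],
    'Slice': ['a', 'b', 'c'],  # shortened placeholder values
})

# rename returns a new DataFrame; df itself keeps its original column names.
renamed = df.rename(columns=lambda x: x[:-2] if x.endswith('_x') else x)

print(list(df.columns))       # ['W_x', 'First_x', 'Last_x', 'Slice']
print(list(renamed.columns))  # ['W', 'First', 'Last', 'Slice']
```

Writing df = df.rename(...) instead would replace the original frame with the renamed one.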