pandas: add a prefix to specific columns of a DataFrame

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/39772896/

Time: 2020-09-14 02:06:39  Source: igfitidea

Add prefix to specific columns of Dataframe

python, pandas

Asked by Geo-x

I have a DataFrame like this:

col1   col2   col3   col4   col5   col6   col7   col8
0      5345   rrf    rrf    rrf    rrf    rrf    rrf
1      2527   erfr   erfr   erfr   erfr   erfr   erfr
2      2727   f      f      f      f      f      f

I would like to rename all columns except col1 and col2.

So I tried to write a loop:

print(df.columns)
for col in df.columns:
    if col != 'col1' and col != 'col2':
        col.rename = str(col) + '_x'

But it's not very efficient... in fact, it doesn't work!

Answered by A.Kot

You can use the DataFrame.rename() method:

new_names = [(i,i+'_x') for i in df.iloc[:, 2:].columns.values]
df.rename(columns = dict(new_names), inplace=True)
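
As a variation on the same idea, rename() also accepts a function instead of a dict, which avoids building the mapping by hand and does not depend on column position. The following is a minimal sketch, not part of the original answer; the sample values and the name `keep` are assumptions made for illustration.

```python
import pandas as pd

# Sample frame mirroring the question's data (values are illustrative).
df = pd.DataFrame({
    'col1': [0, 1, 2],
    'col2': [5345, 2527, 2727],
    'col3': ['rrf', 'erfr', 'f'],
    'col4': ['rrf', 'erfr', 'f'],
})

keep = {'col1', 'col2'}  # hypothetical name: columns to leave untouched

# rename() calls the function once per column label and returns a new frame;
# the original df is left unchanged because inplace is not used.
renamed = df.rename(columns=lambda c: c if c in keep else c + '_x')

print(list(renamed.columns))
```

Leaving out inplace=True keeps the original frame intact, which makes the step easier to rerun or chain.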

Answered by jezrael

The simplest solution, if col1 and col2 are the first and second column names:

df.columns = df.columns[:2].union(df.columns[2:]  + '_x')
print (df)
   col1  col2 col3_x col4_x col5_x col6_x col7_x col8_x
0     0  5345    rrf    rrf    rrf    rrf    rrf    rrf
1     1  2527   erfr   erfr   erfr   erfr   erfr   erfr
2     2  2727      f      f      f      f      f      f
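
One caveat worth checking before relying on union here: Index.union is a set operation and sorts its result by default, so it only reproduces the original order when the column names already happen to sort that way, as they do in this question. The sketch below uses a hypothetical frame with unsorted names to show the difference, and uses Index.append as an order-preserving alternative; it is an illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical frame whose column names are NOT in alphabetical order.
df = pd.DataFrame([[1, 2, 3, 4]], columns=['b', 'a', 'z', 'c'])

head = df.columns[:2]          # labels to keep as-is
tail = df.columns[2:] + '_x'   # labels that receive the suffix

by_union = head.union(tail)    # set union; the result is sorted by default
by_append = head.append(tail)  # plain concatenation; order is preserved

print(list(by_union))
print(list(by_append))

# Assigning the appended index keeps every label on its own column.
df.columns = by_append
```

With sorted names such as col1 to col8 both calls give the same labels, so the original one-liner is fine for the data shown in the question.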

Another solution uses isin or a list comprehension:

cols = df.columns[~df.columns.isin(['col1','col2'])]
print (cols)
['col3', 'col4', 'col5', 'col6', 'col7', 'col8']

df.rename(columns = dict(zip(cols, cols + '_x')), inplace=True)

print (df)

   col1  col2 col3_x col4_x col5_x col6_x col7_x col8_x
0     0  5345    rrf    rrf    rrf    rrf    rrf    rrf
1     1  2527   erfr   erfr   erfr   erfr   erfr   erfr
2     2  2727      f      f      f      f      f      f


cols = [col for col in df.columns if col not in ['col1', 'col2']]
print (cols)
['col3', 'col4', 'col5', 'col6', 'col7', 'col8']

# cols is a plain list here, so build the new names element by element
df.rename(columns = {col: col + '_x' for col in cols}, inplace=True)

print (df)

   col1  col2 col3_x col4_x col5_x col6_x col7_x col8_x
0     0  5345    rrf    rrf    rrf    rrf    rrf    rrf
1     1  2527   erfr   erfr   erfr   erfr   erfr   erfr
2     2  2727      f      f      f      f      f      f

The fastest is the list comprehension:

df.columns = [col+'_x' if col != 'col1' and col != 'col2' else col for col in df.columns]

Timings:

In [350]: %timeit (akot(df))
1000 loops, best of 3: 387 μs per loop

In [351]: %timeit (jez(df1))
The slowest run took 4.12 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 207 μs per loop

In [363]: %timeit (jez3(df2))
The slowest run took 6.41 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 75.7 μs per loop


df1 = df.copy()
df2 = df.copy()

def jez(df):
    df.columns = df.columns[:2].union(df.columns[2:]  + '_x')
    return df

def akot(df):
    new_names = [(i,i+'_x') for i in df.iloc[:, 2:].columns.values]
    df.rename(columns = dict(new_names), inplace=True)
    return df


def jez3(df):
    df.columns = [col + '_x' if col != 'col1' and col != 'col2' else col for col in df.columns]
    return df


print (akot(df))
print (jez(df1))
print (jez3(df2))
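
Note that the three functions above modify the frame they receive, so each repeated %timeit loop appends another '_x' to the already renamed columns. A rough way to compare the approaches without that side effect is sketched below with the standard timeit module. The function names `via_rename`, `via_index` and `via_listcomp` are hypothetical, the frame is a cut-down version of the question's data, and the absolute numbers will vary by machine and pandas version.

```python
import timeit
import pandas as pd

df = pd.DataFrame(
    [[0, 5345, 'rrf', 'rrf'], [1, 2527, 'erfr', 'erfr'], [2, 2727, 'f', 'f']],
    columns=['col1', 'col2', 'col3', 'col4'],
)

def via_rename(frame):
    # Build a mapping for every column after the first two, rename a copy.
    mapping = {c: c + '_x' for c in frame.columns[2:]}
    return list(frame.rename(columns=mapping).columns)

def via_index(frame):
    # Concatenate the untouched labels with the suffixed ones.
    return list(frame.columns[:2].append(frame.columns[2:] + '_x'))

def via_listcomp(frame):
    # Pure Python loop over the labels; no pandas objects are created.
    return [c if c in ('col1', 'col2') else c + '_x' for c in frame.columns]

# None of the functions mutates df, so every timed call sees the same input.
for fn in (via_rename, via_index, via_listcomp):
    seconds = timeit.timeit(lambda: fn(df), number=1000)
    print(f'{fn.__name__}: {seconds / 1000 * 1e6:.1f} µs per call')
```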

Answered by EdChum

You can use str.contains with a regex pattern to filter the columns of interest, then use zip to construct a dict and pass it as the columns argument to rename:

In [94]:
cols = df.columns[~df.columns.str.contains('col1|col2')]
df.rename(columns = dict(zip(cols, cols + '_x')), inplace=True)
df

Out[94]:
   col1  col2 col3_x col4_x col5_x col6_x col7_x col8_x
0     0  5345    rrf    rrf    rrf    rrf    rrf    rrf
1     1  2527   erfr   erfr   erfr   erfr   erfr   erfr
2     2  2727      f      f      f      f      f      f

Here, negating str.contains selects the columns that don't match the pattern, so the column order is irrelevant.
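
One thing to watch with this pattern: str.contains matches substrings, so 'col1|col2' would also match a name such as col10 or col21. The question's frame has no such columns, but if yours might, anchoring the pattern is one option. The sketch below uses a hypothetical frame with a col10 column to show the difference; it is an illustration rather than part of the original answer.

```python
import pandas as pd

# Hypothetical frame with a 'col10' column to show the substring pitfall.
df = pd.DataFrame([[0, 1, 2, 3]], columns=['col1', 'col2', 'col3', 'col10'])

# Unanchored: 'col10' contains 'col1', so it is wrongly left out of the rename.
loose = df.columns[~df.columns.str.contains('col1|col2')]

# Anchored with ^...$ so only the exact names col1 and col2 are excluded.
exact = df.columns[~df.columns.str.contains(r'^(?:col1|col2)$')]

renamed = df.rename(columns=dict(zip(exact, exact + '_x')))

print(list(loose))
print(list(exact))
print(list(renamed.columns))
```

The non-capturing group (?:...) keeps the alternation inside the anchors without introducing a capture group.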