Python: Replacing a few values in a pandas DataFrame column with another value

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/27060098/

Date: 2020-08-19 01:22:54  Source: igfitidea

Replacing few values in a pandas dataframe column with another value

python replace pandas dataframe

Asked by Pulkit Jha

I have a pandas dataframe df as illustrated below:

BrandName Specialty
A          H
B          I
ABC        J
D          K
AB         L

I want to replace 'ABC' and 'AB' in column BrandName with 'A'. Can someone help with this?

Accepted answer by Alex Riley

The easiest way is to use the replace method on the column. The arguments are a list of the things you want to replace (here ['ABC', 'AB']) and what you want to replace them with (the string 'A' in this case):

>>> df['BrandName'].replace(['ABC', 'AB'], 'A')
0    A
1    B
2    A
3    D
4    A

This creates a new Series of values so you need to assign this new column to the correct column name:

df['BrandName'] = df['BrandName'].replace(['ABC', 'AB'], 'A')
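For reference, here is a minimal self-contained sketch of the assignment above, using the sample data from the question. The variable names `original`, `list_form` and `dict_form` are only illustrative. It also shows the dict form of `replace`, which maps each old value to its own replacement and gives the same result here.

```python
import pandas as pd

# Sample data copied from the question
original = pd.DataFrame({
    "BrandName": ["A", "B", "ABC", "D", "AB"],
    "Specialty": ["H", "I", "J", "K", "L"],
})

# List form: every listed value is replaced by the same new value
list_form = original["BrandName"].replace(["ABC", "AB"], "A")

# Dict form: each key is an old value, each dict value is its replacement
dict_form = original["BrandName"].replace({"ABC": "A", "AB": "A"})

# replace returns a new Series, so assign it back to keep the change
df = original.copy()
df["BrandName"] = list_form
print(df)
```

The dict form is handy when different old values should map to different new values.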

Answer by I159

Replace

The DataFrame object has a powerful and flexible replace method:

DataFrame.replace(
        to_replace=None,
        value=None,
        inplace=False,
        limit=None,
        regex=False, 
        method='pad',
        axis=None)
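As an illustration of two of these parameters, the sketch below uses `to_replace` as a nested dict to restrict the replacement to one column, and `regex=True` to match values by pattern. The exact parameter list may differ between pandas versions, so treat the signature above as a snapshot. The pattern `^AB.*$` is an assumption chosen for this example, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "BrandName": ["A", "B", "ABC", "D", "AB"],
    "Specialty": ["H", "I", "J", "K", "L"],
})

# Nested dict: {column: {old_value: new_value}} limits the replacement
# to the BrandName column and returns a new DataFrame
by_column = df.replace(to_replace={"BrandName": {"ABC": "A", "AB": "A"}})

# regex=True treats to_replace as a regular expression; this pattern
# matches any whole value that starts with "AB"
by_regex = df["BrandName"].replace(to_replace=r"^AB.*$", value="A", regex=True)

print(by_column)
print(by_regex.tolist())
```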

Note: if you need to make changes in place, use the boolean inplace argument of the replace method:

Inplace

inplace: boolean, default False. If True, the replacement is done in place. Note: this will modify any other views on this object (e.g. a column from a DataFrame). Returns the caller if this is True.

Snippet

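# Calling replace(..., inplace=True) on df['BrandName'] may act on a temporary
# copy in recent pandas versions, so replace through the DataFrame instead.
df.replace(
    to_replace={'BrandName': {'ABC': 'A', 'AB': 'A'}},
    inplace=True
)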

Answer by Namrata Tolani

This solution will change the existing dataframe itself:

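import pandas as pd
mydf = pd.DataFrame({"BrandName": ["A", "B", "ABC", "D", "AB"], "Specialty": ["H", "I", "J", "K", "L"]})
# Replace through the DataFrame so mydf itself is updated; inplace=True on mydf["BrandName"] may hit a temporary copy.
mydf.replace({"BrandName": {"ABC": "A", "AB": "A"}}, inplace=True)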

Answer by Saurabh

The loc indexer can be used to replace multiple values (see the pandas documentation for loc):

df.loc[df['BrandName'].isin(['ABC', 'AB']), 'BrandName'] = 'A'
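A self-contained sketch of the loc approach on the question's data, assuming only the BrandName column should change. Passing the column label as the second indexer matters: a row mask on its own assigns to every column of the matching rows, which would also overwrite Specialty. The name `mask` is just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "BrandName": ["A", "B", "ABC", "D", "AB"],
    "Specialty": ["H", "I", "J", "K", "L"],
})

# Boolean mask that is True for the rows whose BrandName should be replaced
mask = df["BrandName"].isin(["ABC", "AB"])

# Row mask plus column label: only BrandName is overwritten in those rows
df.loc[mask, "BrandName"] = "A"
print(df)
```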

Answer by shubham ranjan

Create the DataFrame:

import pandas as pd
dk=pd.DataFrame({"BrandName":['A','B','ABC','D','AB'],"Specialty":['H','I','J','K','L']})

Now use the replace() method on the column. It returns a new Series, so assign the result back to keep the change:

dk['BrandName'] = dk.BrandName.replace(to_replace=['ABC','AB'],value='A')

Answer by Claudiu Creanga

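Just wanted to compare the performance of the 2 main ways of doing it. The timings originally posted (19.9 ns ± 0.0873 ns for loc, 19.6 ns ± 0.509 ns for replace) came from running %timeit loc and %timeit replace without parentheses, which times only the lookup of the function name and not the call, so they do not show how the two approaches compare. The setup below calls the functions so that the actual operations are timed:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0,10,size=(100, 4)), columns=list('ABCD'))
df1 = df.copy()  # separate copies so the two approaches do not affect each other
df2 = df.copy()

def loc():
    # Row mask plus column label, so only column A is changed
    df1.loc[df1["A"] == 2, "A"] = 5
%timeit loc()

def replace():
    df2["A"] = df2["A"].replace(to_replace=2, value=5)
%timeit replace()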
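Outside IPython, a comparable measurement can be sketched with the standard timeit module. This is only an illustration under stated assumptions: the helper names `with_loc` and `with_replace`, the fixed seed and the repeat count are arbitrary choices, and each call includes the cost of copying the frame so that every run starts from the same data. No timings are quoted here because they depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, an arbitrary choice for repeatability
base = pd.DataFrame(rng.integers(0, 10, size=(100, 4)), columns=list("ABCD"))


def with_loc(frame):
    """Replace 2 with 5 in column A using a boolean mask and loc."""
    out = frame.copy()
    out.loc[out["A"] == 2, "A"] = 5
    return out


def with_replace(frame):
    """Replace 2 with 5 in column A using Series.replace."""
    out = frame.copy()
    out["A"] = out["A"].replace(to_replace=2, value=5)
    return out


# Both approaches should produce the same column A
same_result = with_loc(base)["A"].tolist() == with_replace(base)["A"].tolist()

runs = 200
loc_seconds = timeit.timeit(lambda: with_loc(base), number=runs)
replace_seconds = timeit.timeit(lambda: with_replace(base), number=runs)

print(f"same result: {same_result}")
print(f"loc:     {loc_seconds / runs * 1e6:.1f} microseconds per call")
print(f"replace: {replace_seconds / runs * 1e6:.1f} microseconds per call")
```

Note that the lambdas actually call the functions, which is what makes the measurement meaningful.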