Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/41962022/

Date: 2020-09-14 02:53:35  Source: igfitidea

Apply function to dataframe column element based on value in other column for same row?

python pandas numpy

Asked by Chuck

I have a dataframe:

import pandas as pd

df = pd.DataFrame(
    {'number': [10, 20, 30, 40], 'condition': ['A', 'B', 'A', 'B']})

df = 
    number    condition
0    10         A
1    20         B
2    30         A
3    40         B

I want to apply a function to each element within the number column, as follows:

 df['number'] = df['number'].apply(lambda x: func(x))

However, even though I apply the function to the number column, I want the function to also refer to the condition column, i.e. in pseudocode:

func(n):
    #if the value in corresponding condition column is equal to some set of values:
        # do some stuff to n using the value in condition
        # return new value for n

For a single number, and an example function I would write:

number = 10
condition = 'A'

def func(num, condition):
    if condition == 'A':
        return num * 3
    if condition == 'B':
        return num * 4

func(number, condition)  # returns 30

How can I incorporate the same function into the apply statement written above, i.e. refer to the value in the condition column while acting on the value in the number column?

Note: I have read through the docs on np.where(), pandas.loc() and pandas.index(), but I just cannot figure out how to put them into practice.

I am struggling with the syntax for referencing the other column from within the function, as I need access to the values in both the number and condition columns.

As such, my expected output is:

df = 
    number    condition
0    30         A
1    80         B
2    90         A
3    160         B

UPDATE: The above was far too vague. Please see the following:

df1 = pd.DataFrame({'Entries':['man','guy','boy','girl'],'Conflict':['Yes','Yes','Yes','No']})


    Entries    Conflict
0    "man"    "Yes"
1    "guy"    "Yes"
2    "boy"    "Yes"
3    "girl"   "No"

def funcA(d):
    d = d + 'aaa'
    return d
def funcB(d):
    d = d + 'bbb'
    return d

df1['Entries'] = np.where(df1['Conflict'] == 'Yes', funcA, funcB)

Output:
{'Conflict': ['Yes', 'Yes', 'Yes', 'No'],
 'Entries': array(<function funcB at 0x7f4acbc5a500>, dtype=object)}

How can I apply the above np.where statement to take a pandas series as mentioned in the comments, and produce the desired output shown below:

Desired Output:

    Entries    Conflict
0    "manaaa"    "Yes"
1    "guyaaa"    "Yes"
2    "boyaaa"    "Yes"
3    "girlbbb"   "No"

Accepted answer by blacksite

I don't know about using pandas.DataFrame.apply, but you could define a condition: multiplier key-value mapping (multiplier below) and pass it into your function. Then you can use a list comprehension to calculate the new number output based on those conditions:

import pandas as pd
df = pd.DataFrame({'number': [10, 20 , 30, 40], 'condition': ['A', 'B', 'A', 'B']})

multiplier = {'A': 3, 'B': 4}

def func(num, condition, multiplier):
    return num * multiplier[condition]

df['new_number'] = [func(df.loc[idx, 'number'], df.loc[idx, 'condition'], 
                     multiplier) for idx in range(len(df))]

Here's the result:

df
Out[24]: 
  condition  number  new_number
0         A      10          30
1         B      20          80
2         A      30          90
3         B      40         160

There is likely a vectorized, pure-pandas solution that's more "ideal." But this works, too, in a pinch.
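
One possible vectorized version, sketched here as an illustration rather than as part of the original answer: Series.map can look up each condition in the dictionary, and the two columns can then be multiplied element-wise. The mapping A -> 3, B -> 4 is an assumption chosen to match the question's expected output.

```python
import pandas as pd

df = pd.DataFrame({'number': [10, 20, 30, 40],
                   'condition': ['A', 'B', 'A', 'B']})

# Assumed mapping: A triples, B quadruples (matches the question's expected output).
multiplier = {'A': 3, 'B': 4}

# Series.map replaces each condition with its multiplier;
# the multiplication is then done element-wise, row by row.
df['new_number'] = df['number'] * df['condition'].map(multiplier)
```

Any condition missing from the dictionary maps to NaN, so the result for that row would be NaN instead of raising a KeyError.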

Answer by Rene B.

Since the question asks about applying a function to a dataframe column using another value from the same row, it seems more accurate to use the pandas apply function in combination with a lambda:

import pandas as pd
df = pd.DataFrame({'number': [10, 20 , 30, 40], 'condition': ['A', 'B', 'A', 'B']})

def func(number,condition):
    multiplier = {'A': 2, 'B': 4}
    return number * multiplier[condition]

df['new_number'] = df.apply(lambda x: func(x['number'], x['condition']), axis=1)

In this example, apply with axis=1 passes each row of the dataframe df to the lambda, which forwards that row's 'number' and 'condition' values to func.

This returns the following result:

df
Out[10]: 
  condition  number  new_number
0         A      10          20
1         B      20          80
2         A      30          60
3         B      40         160
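
As a side note, the same two-argument function can also be called without apply by zipping the two columns together. This is only a sketch of an alternative, not something the answer proposes. It reuses func and its A -> 2, B -> 4 mapping from above and avoids constructing a row object for every call.

```python
import pandas as pd

df = pd.DataFrame({'number': [10, 20, 30, 40],
                   'condition': ['A', 'B', 'A', 'B']})

def func(number, condition):
    multiplier = {'A': 2, 'B': 4}
    return number * multiplier[condition]

# zip pairs up the values of the two columns row by row,
# so func receives one number and one condition per call.
df['new_number'] = [func(n, c) for n, c in zip(df['number'], df['condition'])]
```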

For the UPDATE case, it is also possible to use the pandas apply function:

df1 = pd.DataFrame({'Entries':['man','guy','boy','girl'],'Conflict':['Yes','Yes','Yes','No']})

def funcA(d):
    d = d + 'aaa'
    return d
def funcB(d):
    d = d + 'bbb'
    return d

df1['Entries'] = df1.apply(lambda x: funcA(x['Entries']) if x['Conflict'] == 'Yes' else funcB(x['Entries']), axis=1)

In this example, the lambda receives each row of the dataframe df1 and passes that row's 'Entries' value to either funcA or funcB. An if-else expression inside the lambda chooses between them based on the row's 'Conflict' value.

This returns the following result:

df1
Out[12]: 
  Conflict  Entries
0      Yes   manaaa
1      Yes   guyaaa
2      Yes   boyaaa
3       No  girlbbb
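
The np.where attempt in the question passed the function objects themselves, which is why the output contained a function object, not strings. A minimal sketch of one way to make that approach work, assuming funcA and funcB accept a whole Series (string concatenation does), is to call both functions on the column and let np.where pick between the two results:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Entries': ['man', 'guy', 'boy', 'girl'],
                    'Conflict': ['Yes', 'Yes', 'Yes', 'No']})

def funcA(d):
    return d + 'aaa'

def funcB(d):
    return d + 'bbb'

# Both functions are evaluated on the full column; np.where then takes
# the funcA result where Conflict is 'Yes' and the funcB result elsewhere.
df1['Entries'] = np.where(df1['Conflict'] == 'Yes',
                          funcA(df1['Entries']),
                          funcB(df1['Entries']))
```

Because both branches are computed for every row, this only suits functions that are cheap and safe to run on all values.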