Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, link to the original, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/47969756/

Date: 2020-09-14 04:57:29  Source: igfitidea

Pandas apply function that returns two new columns

Tags: python, python-2.7, pandas

Asked by user2242044

I have a pandas DataFrame on which I would like to use an apply function to generate two new columns based on the existing data. I am getting this error: ValueError: Wrong number of items passed 2, placement implies 1

import pandas as pd
import numpy as np

def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    return [C, D]

df = pd.DataFrame(np.random.randint(0,10,size=(2, 2)), columns=list('AB'))

df['C', 'D'] = df.apply(myfunc1, axis=1)

Starting DF:

   A  B
0  6  1
1  8  4

Desired DF:

   A  B   C   D
0  6  1  16  56
1  8  4  18  58

Answered by oim

Based on your latest error, you can avoid it by returning the new columns as a Series:

def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    return pd.Series([C, D])

df[['C', 'D']] = df.apply(myfunc1, axis=1)
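
A self-contained sketch of this approach, using fixed input values instead of random ones so the result is reproducible. Labelling the returned Series with index=['C', 'D'] is an addition that is not in the answer above; it just makes the intermediate result carry the intended column names.

```python
import pandas as pd

def myfunc1(row):
    # Return both derived values as one labelled Series per row.
    c = row['A'] + 10
    d = row['A'] + 50
    return pd.Series([c, d], index=['C', 'D'])

# Fixed values (taken from the question's example) instead of random data.
df = pd.DataFrame({'A': [6, 8], 'B': [1, 4]})

# With axis=1, the per-row Series are stacked into a two-column DataFrame.
new_cols = df.apply(myfunc1, axis=1)
df[['C', 'D']] = new_cols

print(df)
```

Because apply returns a DataFrame here, the list key [['C', 'D']] on the left-hand side has two targets for two source columns, which is what the original error was complaining about.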

Answered by Bharath

df['C','D'] is treated as one column (keyed by the tuple ('C','D')) rather than two. To assign two columns you need to pass a list of column names, so use df[['C','D']]:

df[['C', 'D']] = df.apply(myfunc1, axis=1)

   A  B   C   D
0  4  6  14  54
1  5  1  15  55

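Or you can unpack the result into two separate column assignments. This needs myfunc1 to return a plain list or tuple (on pandas 0.23 or later, where apply then yields a Series of lists), and the per-row results must be transposed with zip, i.e.

df['C'], df['D'] = zip(*df.apply(myfunc1, axis=1))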
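
The point above, that df['C','D'] addresses one column rather than two, can be checked directly. This is a minimal sketch with made-up fixed values; the second frame df2 exists only for the comparison.

```python
import pandas as pd

df = pd.DataFrame({'A': [6, 8], 'B': [1, 4]})

# A comma inside single brackets builds a tuple, so this creates
# ONE column whose label is the tuple ('C', 'D').
df['C', 'D'] = 0
print(list(df.columns))

# A list inside the brackets addresses TWO columns, 'C' and 'D'.
df2 = pd.DataFrame({'A': [6, 8], 'B': [1, 4]})
df2[['C', 'D']] = pd.DataFrame({'C': [16, 18], 'D': [56, 58]})
print(list(df2.columns))
```

This is why the original assignment fails: two columns of values are being placed into a single tuple-labelled column.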

Answered by gabe_

Add an extra pair of brackets (that is, pass a list of column names) when selecting or assigning multiple columns.

import pandas as pd
import numpy as np

def myfunc1(row):
    C = row['A'] + 10
    D = row['A'] + 50
    return [C, D]

df = pd.DataFrame(np.random.randint(0,10,size=(2, 2)), columns=list('AB'))

df[['C', 'D']] = df.apply(myfunc1, axis=1)
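
On recent pandas versions (0.23 and later, as far as I know), a function that returns a plain list makes apply produce a Series of lists, so the assignment above can fail with a length-mismatch error. One hedged way to keep the list-returning function is to pass result_type='expand', which spreads list-like results across columns. The fixed input values below are an assumption for reproducibility.

```python
import pandas as pd

def myfunc1(row):
    c = row['A'] + 10
    d = row['A'] + 50
    return [c, d]

df = pd.DataFrame({'A': [6, 8], 'B': [1, 4]})

# result_type='expand' turns each returned list into a row of a new
# DataFrame whose columns are labelled 0 and 1.
expanded = df.apply(myfunc1, axis=1, result_type='expand')
df[['C', 'D']] = expanded

print(df)
```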

Answered by Federico Dorato

Please be aware that the accepted answer is slow and consumes a lot of memory; see https://ys-l.github.io/posts/2015/08/28/how-not-to-use-pandas-apply/ for details.

Using the suggestion presented there, the answer would look like this:

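def run_loopy(df):
    # Collect results in plain Python lists and build the DataFrame once
    Cs, Ds = [], []
    for _, row in df.iterrows():
        c, d = myfunc1(row['A'])
        Cs.append(c)
        Ds.append(d)
    # Reuse df's index so the assignment below aligns row by row
    return pd.DataFrame({'C': Cs,
                         'D': Ds}, index=df.index)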

def myfunc1(a):
    c = a + 10
    d = a + 50
    return c,d

df[['C', 'D']] = run_loopy(df)
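
For arithmetic as simple as in this example, a per-row function may not be needed at all. The sketch below assumes the real logic can be expressed as whole-column operations, which may not hold for your actual use case; the input values are fixed for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [6, 8], 'B': [1, 4]})

# Column-wise arithmetic: no Python-level function call per row.
df = df.assign(C=df['A'] + 10, D=df['A'] + 50)

print(df)
```

When the computation cannot be vectorized like this, the loop-based run_loopy version above remains the applicable option.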