Applying a lambda per row on multiple columns in pandas

Disclaimer: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/51080174/

Time: 2020-09-14 05:45:14  Source: igfitidea

python pandas dataframe if-statement lambda

Asked by muni

I am creating a sample dataframe:

import pandas as pd

tp = pd.DataFrame({'source': ['a', 's', 'f'],
                   'target': ['b', 'n', 'm'],
                   'count': [0, 8, 4]})

I then want to create a column 'col' based on the 'target' column: it should take the value of 'source' when 'target' matches the condition, and a default value otherwise, as below:

tp['col'] = tp.apply(lambda row:row['source'] if row['target'] in ['b','n'] else 'x')

But it throws this error: KeyError: ('target', 'occurred at index count')

How can I make it work, without defining a function?

Answered by jpp

As per @Zero's comment, you need to use axis=1 to tell pandas you want to apply the function to each row. The default is axis=0, which passes each column to the function instead, so the label 'target' is not found in it.

tp['col'] = tp.apply(lambda row: row['source'] if row['target'] in ['b', 'n'] else 'x',
                     axis=1)
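
To see why the default fails, it can help to look at what the lambda actually receives. The following is a minimal sketch using the question's dataframe; the names first_column and first_row are only illustrative stand-ins for the Series that apply passes under axis=0 and axis=1 respectively.

```python
import pandas as pd

tp = pd.DataFrame({'source': ['a', 's', 'f'],
                   'target': ['b', 'n', 'm'],
                   'count': [0, 8, 4]})

# Under axis=0 the function receives one column at a time:
# a Series indexed by the row labels 0, 1, 2.
first_column = tp['source']
print('target' in first_column.index)  # the label 'target' is not a row label

# Under axis=1 the function receives one row at a time:
# a Series indexed by the column names.
first_row = tp.iloc[0]
print('target' in first_row.index)     # the label 'target' is a column name

# With axis=1 the row-wise lookup succeeds.
tp['col'] = tp.apply(
    lambda row: row['source'] if row['target'] in ['b', 'n'] else 'x',
    axis=1)
print(tp)
```

Because a column Series has no 'target' label, row['target'] raises the KeyError shown in the question; a row Series does have it, so the same lambda works once axis=1 is passed.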


However, for this specific task, you should use vectorised operations. For example, using numpy.where:

import numpy as np

tp['col'] = np.where(tp['target'].isin(['b', 'n']), tp['source'], 'x')

pd.Series.isin returns a Boolean series which tells numpy.where whether to select the second or third argument for each row.
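
As a rough illustration of the two steps, the sketch below builds the Boolean mask separately and then checks that the vectorised result matches the row-wise apply version. The names mask and via_apply are only introduced here for illustration.

```python
import numpy as np
import pandas as pd

tp = pd.DataFrame({'source': ['a', 's', 'f'],
                   'target': ['b', 'n', 'm'],
                   'count': [0, 8, 4]})

# Step 1: one Boolean per row, True where 'target' is 'b' or 'n'.
mask = tp['target'].isin(['b', 'n'])
print(mask.tolist())

# Step 2: take 'source' where the mask is True, otherwise the default 'x'.
tp['col'] = np.where(mask, tp['source'], 'x')
print(tp)

# Cross-check against the row-wise apply version.
via_apply = tp.apply(
    lambda row: row['source'] if row['target'] in ['b', 'n'] else 'x',
    axis=1)
print((tp['col'] == via_apply).all())
```

Both approaches give the same column here; the vectorised one avoids calling a Python function once per row, which is usually what makes it faster on larger frames.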
