Original source: http://stackoverflow.com/questions/51080174/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow
applying lambda row on multiple columns pandas
Asked by muni
I am creating a sample dataframe:
import pandas as pd

tp = pd.DataFrame({'source': ['a', 's', 'f'],
                   'target': ['b', 'n', 'm'],
                   'count': [0, 8, 4]})
And creating a column 'col' based on a condition on the 'target' column: if 'target' matches the condition, 'col' takes the value of 'source'; otherwise it gets a default, as below:
tp['col'] = tp.apply(lambda row:row['source'] if row['target'] in ['b','n'] else 'x')
But it's throwing me this error: KeyError: ('target', 'occurred at index count')
How can I make it work, without defining a function?
Answered by jpp
As per @Zero's comment, you need to use axis=1 to tell Pandas you want to apply a function to each row. The default is axis=0.
tp['col'] = tp.apply(lambda row: row['source'] if row['target'] in ['b', 'n'] else 'x',
axis=1)
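To see why the original call raised a KeyError, it may help to check what the lambda receives under each axis. The following is a minimal sketch, not part of the original answer; the variable names has_target_default and has_target_rows are illustrative only. With the default axis=0 the function gets one column at a time, whose index is the row labels 0, 1, 2, so looking up 'target' fails. With axis=1 it gets one row at a time, whose index is the column names.

```python
import pandas as pd

tp = pd.DataFrame({'source': ['a', 's', 'f'],
                   'target': ['b', 'n', 'm'],
                   'count': [0, 8, 4]})

# Default axis=0: the lambda is called once per column.
# Each column Series is indexed by the row labels, so 'target' is not a valid key.
has_target_default = tp.apply(lambda s: 'target' in s.index)

# axis=1: the lambda is called once per row.
# Each row Series is indexed by the column names, so row['target'] works.
has_target_rows = tp.apply(lambda row: 'target' in row.index, axis=1)

print(has_target_default)  # one entry per column, all False
print(has_target_rows)     # one entry per row, all True
```

This matches the reported error message, which says the failure occurred while the function was being applied to the 'count' column.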
However, for this specific task, you should use vectorised operations. For example, using numpy.where:
tp['col'] = np.where(tp['target'].isin(['b', 'n']), tp['source'], 'x')
pd.Series.isin returns a Boolean series which tells numpy.where whether to select the second or third argument.
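As a small self-contained sketch of that selection, the Boolean mask can be built separately and inspected before it is passed to numpy.where. The intermediate name mask is introduced here for illustration and does not appear in the original answer.

```python
import numpy as np
import pandas as pd

tp = pd.DataFrame({'source': ['a', 's', 'f'],
                   'target': ['b', 'n', 'm'],
                   'count': [0, 8, 4]})

# True where 'target' is one of the listed values, False otherwise.
mask = tp['target'].isin(['b', 'n'])

# Where mask is True take the value from 'source', otherwise use the default 'x'.
tp['col'] = np.where(mask, tp['source'], 'x')

print(mask.tolist())       # [True, True, False]
print(tp['col'].tolist())  # ['a', 's', 'x']
```

The first two rows have targets 'b' and 'n', so they keep their 'source' values; the third row has target 'm' and falls back to 'x'.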