Original question: http://stackoverflow.com/questions/34279378/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Python pandas: apply a function with two arguments to columns
Asked by Maria
Can you apply a Python pandas function that takes values from two different columns as arguments?
I have a function that returns 1 if the values in two columns fall in the same range; otherwise it returns 0:
def segmentMatch(RealTime, ResponseTime):
    if RealTime <= 566 and ResponseTime <= 566:
        matchVar = 1
    elif 566 < RealTime <= 1132 and 566 < ResponseTime <= 1132:
        matchVar = 1
    elif 1132 < RealTime <= 1698 and 1132 < ResponseTime <= 1698:
        matchVar = 1
    else:
        matchVar = 0
    return matchVar
I want the first argument, RealTime, to be a column in my data frame, such that the function takes the value of each row in that column, e.g. RealTime is df['TimeCol'] and the second argument is df['ResponseCol']. I'd like the result to be a new column in the dataframe. I came across several threads that have answered a similar question, but it looks like those arguments were variables, not values in rows of the dataframe.
I tried the following but it didn't work:
df['NewCol'] = df.apply(segmentMatch, args=(df['TimeCol'], df['ResponseCol']), axis=1)
Accepted answer by N. Wouda
Why not just do this?
df['NewCol'] = df.apply(lambda x: segmentMatch(x['TimeCol'], x['ResponseCol']), axis=1)
Rather than trying to pass the whole columns as arguments, as in your example, this passes the appropriate entries of each row as arguments and stores the result in 'NewCol'.
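To make the difference concrete, here is a small sketch on made-up data (the sample values are hypothetical, not from the question). It assumes a reasonably recent pandas. With axis=1, apply passes each row as the first argument and appends whatever is in args, so the attempt from the question ends up calling segmentMatch with three arguments and raises a TypeError, while the lambda version receives one row and unpacks the two entries itself.

```python
import pandas as pd

def segmentMatch(RealTime, ResponseTime):
    # Same logic as in the question: 1 if both values fall in the same segment.
    if RealTime <= 566 and ResponseTime <= 566:
        return 1
    elif 566 < RealTime <= 1132 and 566 < ResponseTime <= 1132:
        return 1
    elif 1132 < RealTime <= 1698 and 1132 < ResponseTime <= 1698:
        return 1
    return 0

# Hypothetical sample data, made up for illustration.
df = pd.DataFrame({
    'TimeCol': [100, 600, 1200, 1500, 2000],
    'ResponseCol': [500, 1200, 1300, 400, 2100],
})

# The attempt from the question: apply calls
# segmentMatch(row, df['TimeCol'], df['ResponseCol']), i.e. three arguments.
try:
    df.apply(segmentMatch, args=(df['TimeCol'], df['ResponseCol']), axis=1)
    attempt_failed = False
except TypeError:
    attempt_failed = True

# The approach from the answer: pick the two entries out of each row.
df['NewCol'] = df.apply(
    lambda x: segmentMatch(x['TimeCol'], x['ResponseCol']), axis=1
)
result = df['NewCol'].tolist()
```

Only rows where both times land in the same segment get a 1; values above 1698 never match.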
Answer by rahul
You don't really need a lambda function if you are defining the function outside:
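def segmentMatch(vec):
    # vec is one row of the selected columns (a Series); .iloc makes the
    # positional access explicit, since plain vec[0] on a labeled Series
    # is deprecated in recent pandas versions
    RealTime = vec.iloc[0]
    ResponseTime = vec.iloc[1]
    if RealTime <= 566 and ResponseTime <= 566:
        matchVar = 1
    elif 566 < RealTime <= 1132 and 566 < ResponseTime <= 1132:
        matchVar = 1
    elif 1132 < RealTime <= 1698 and 1132 < ResponseTime <= 1698:
        matchVar = 1
    else:
        matchVar = 0
    return matchVar

df['NewCol'] = df[['TimeCol', 'ResponseCol']].apply(segmentMatch, axis=1)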
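As a quick check of this row-vector variant, the sketch below runs it on made-up data (the sample values are hypothetical). The function body is a compact rewrite of the same three range checks, not the exact code above, and it reads the row by position, so the order of the columns in the selection decides which value is treated as RealTime and which as ResponseTime.

```python
import pandas as pd

def segmentMatch(vec):
    # vec is one row of the selected columns, passed in as a Series.
    RealTime, ResponseTime = vec.iloc[0], vec.iloc[1]
    # The three segments from the question as (low, high] ranges.
    segments = [(float('-inf'), 566), (566, 1132), (1132, 1698)]
    for low, high in segments:
        if low < RealTime <= high and low < ResponseTime <= high:
            return 1
    return 0

# Hypothetical sample data, made up for illustration.
df = pd.DataFrame({
    'TimeCol': [100, 600, 1200, 1500, 2000],
    'ResponseCol': [500, 1200, 1300, 400, 2100],
})

# Select the two columns in the order the function expects, then apply row-wise.
df['NewCol'] = df[['TimeCol', 'ResponseCol']].apply(segmentMatch, axis=1)
result = df['NewCol'].tolist()
```

Selecting only the needed columns also keeps unrelated columns out of the row that the function receives.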
If "segmentMatch" were to return a vector of 2 values instead, you could do the following:
def segmentMatch(vec):
    ......
    return pd.Series((matchVar1, matchVar2))

df[['NewCol', 'NewCol2']] = df[['TimeCol','ResponseCol']].apply(segmentMatch, axis=1)
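Since the function body is elided above, here is one possible filled-in version as a sketch. The choice of the second return value (the absolute difference between the two times) and the sample data are assumptions made purely for illustration; the original answer does not say what matchVar2 should be. It also assumes a reasonably recent pandas, where assigning a two-column result to a list of new column names works.

```python
import pandas as pd

def segmentMatch(vec):
    RealTime, ResponseTime = vec.iloc[0], vec.iloc[1]
    # matchVar1: same-segment flag, as in the question.
    if RealTime <= 566 and ResponseTime <= 566:
        matchVar1 = 1
    elif 566 < RealTime <= 1132 and 566 < ResponseTime <= 1132:
        matchVar1 = 1
    elif 1132 < RealTime <= 1698 and 1132 < ResponseTime <= 1698:
        matchVar1 = 1
    else:
        matchVar1 = 0
    # matchVar2: hypothetical second output, chosen only for this example.
    matchVar2 = abs(int(RealTime) - int(ResponseTime))
    # Returning a Series makes apply build one output column per element.
    return pd.Series((matchVar1, matchVar2))

# Hypothetical sample data, made up for illustration.
df = pd.DataFrame({
    'TimeCol': [100, 600, 1200, 1500, 2000],
    'ResponseCol': [500, 1200, 1300, 400, 2100],
})

# The two result columns are assigned to the two new names in order.
df[['NewCol', 'NewCol2']] = df[['TimeCol', 'ResponseCol']].apply(segmentMatch, axis=1)
flags = df['NewCol'].tolist()
diffs = df['NewCol2'].tolist()
```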