Original question: http://stackoverflow.com/questions/19914937/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Applying function with multiple arguments to create a new pandas column
Asked by Michael
I want to create a new column in a pandas data frame by applying a function to two existing columns. Following this answer, I've been able to create a new column when I only need one column as an argument:
import pandas as pd

df = pd.DataFrame({"A": [10, 20, 30], "B": [20, 30, 10]})

def fx(x):
    # Single-argument function applied to each value of column A
    return x * x

print(df)
df['newcolumn'] = df.A.apply(fx)
print(df)
However, I cannot figure out how to do the same thing when the function requires multiple arguments. For example, how do I create a new column by passing column A and column B to the function below?
def fxy(x, y):
    return x * y
Accepted answer by alko
Alternatively, you can use the underlying NumPy function:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"A": [10, 20, 30], "B": [20, 30, 10]})
>>> df['new_column'] = np.multiply(df['A'], df['B'])
>>> df
    A   B  new_column
0  10  20         200
1  20  30         600
2  30  10         300
Or, in the general case, vectorize an arbitrary function:
>>> def fx(x, y):
...     return x * y
...
>>> df['new_column'] = np.vectorize(fx)(df['A'], df['B'])
>>> df
    A   B  new_column
0  10  20         200
1  20  30         600
2  30  10         300
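As far as I know, np.vectorize is mainly a convenience wrapper and still calls the Python function once per element, so it is not expected to be faster than a plain loop. The sketch below is not from the original answer: the branching body of fxy and the otypes argument are assumptions added for illustration. It shows np.vectorize with an explicit output dtype next to np.where, which evaluates the same logic on whole columns at once.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [10, 20, 30], "B": [20, 30, 10]})

def fxy(x, y):
    # Hypothetical scalar logic with a branch (illustration only)
    return x * y if x > y else x + y

# otypes sets the output dtype explicitly instead of inferring it
vec_fxy = np.vectorize(fxy, otypes=[np.int64])
df["new_column"] = vec_fxy(df["A"], df["B"])

# The same logic expressed on whole columns with np.where
df["new_column_where"] = np.where(
    df["A"] > df["B"], df["A"] * df["B"], df["A"] + df["B"]
)
```

When the logic can be written with column-wise operations like np.where, that form is usually preferable; np.vectorize is a fallback for functions that only accept scalars.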
Answer by greenafrican
This solves the problem:
df['newcolumn'] = df.A * df.B
You could also do:
def fab(row):
    # row is a Series holding one row's values, indexed by column name
    return row['A'] * row['B']

df['newcolumn'] = df.apply(fab, axis=1)
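If the row function also needs values that are not columns, DataFrame.apply can forward them through its args parameter and extra keyword arguments. A minimal sketch, assuming a hypothetical function scaled_product with made-up parameters factor and offset (neither appears in the original answer):

```python
import pandas as pd

df = pd.DataFrame({"A": [10, 20, 30], "B": [20, 30, 10]})

def scaled_product(row, factor, offset=0):
    # factor and offset are hypothetical extra parameters for illustration
    return row["A"] * row["B"] * factor + offset

# args supplies extra positional arguments; offset is forwarded as a keyword
df["newcolumn"] = df.apply(scaled_product, axis=1, args=(2,), offset=1)
```

This keeps the function signature explicit instead of capturing the extra values in a lambda.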
Answer by Roman Pekar
You can go with @greenafrican's example if you are able to rewrite your function. If you don't want to rewrite it, you can wrap it in an anonymous function inside apply, like this:
>>> def fxy(x, y):
...     return x * y
...
>>> df['newcolumn'] = df.apply(lambda x: fxy(x['A'], x['B']), axis=1)
>>> df
    A   B  newcolumn
0  10  20        200
1  20  30        600
2  30  10        300
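Another way to call an unmodified two-argument function, not mentioned in the original answers, is a list comprehension over the zipped columns. This is only a sketch of that alternative; it avoids building a Series for every row, which is often (though not always) faster than apply with axis=1.

```python
import pandas as pd

df = pd.DataFrame({"A": [10, 20, 30], "B": [20, 30, 10]})

def fxy(x, y):
    return x * y

# Pair up the values of A and B row by row and call fxy on each pair
df["newcolumn"] = [fxy(a, b) for a, b in zip(df["A"], df["B"])]
```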
Answer by Surya
Another clean, dict-style syntax:
df["new_column"] = df.apply(lambda x: x["A"] * x["B"], axis=1)
Or:
df["new_column"] = df["A"] * df["B"]
Answer by toto_tico
If you need to create multiple columns at once:
Create the dataframe:
import pandas as pd
df = pd.DataFrame({"A": [10, 20, 30], "B": [20, 30, 10]})
Create the function:
def fab(row):
    return row['A'] * row['B'], row['A'] + row['B']
Assign the new columns:
df['newcolumn'], df['newcolumn2'] = zip(*df.apply(fab, axis=1))
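As an alternative to unpacking with zip, DataFrame.apply accepts result_type="expand", which turns each returned tuple into separate columns. This sketch is not part of the original answer and assumes a pandas version that supports result_type (0.23 or later, as far as I know):

```python
import pandas as pd

df = pd.DataFrame({"A": [10, 20, 30], "B": [20, 30, 10]})

def fab(row):
    # Returns a (product, sum) tuple for each row
    return row["A"] * row["B"], row["A"] + row["B"]

# result_type="expand" spreads each tuple across columns 0 and 1,
# which are then assigned to the two new column names in order
df[["newcolumn", "newcolumn2"]] = df.apply(fab, axis=1, result_type="expand")
```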