Conditionally fill column values based on another column's value in pandas
Original source: http://stackoverflow.com/questions/10715519/
This content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by Jan Willem Tulp
I have a DataFrame with a few columns. One column contains a symbol for the currency being used, for instance a euro or a dollar sign. Another column contains a budget value. So one row could mean a budget of 5000 in euro, and the next row could say a budget of 2000 in dollars.
In pandas I would like to add an extra column to my DataFrame that normalizes the budgets to euro. For each row, the value in the new column should be the budget value * 1 if the symbol in the currency column is a euro sign, and the budget value * 0.78125 if it is a dollar sign.
I know how to add a column, fill it with values, copy values from another column etc. but not how to fill the new column conditionally based on the value of another column.
Any suggestions?
Answered by Wes McKinney
You probably want to do:
import numpy as np
df['Normalized'] = np.where(df['Currency'] == '$', df['Budget'] * 0.78125, df['Budget'])
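As a minimal sketch of how this behaves, assume a small made-up DataFrame with the `Currency` and `Budget` columns described in the question. The sample values are illustrative only and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    'Currency': ['€', '$', '$'],
    'Budget': [5000, 2000, 1000],
})

# Where the currency is a dollar sign, convert at 0.78125;
# every other row keeps its budget unchanged.
df['Normalized'] = np.where(df['Currency'] == '$', df['Budget'] * 0.78125, df['Budget'])

print(df['Normalized'].tolist())  # [5000.0, 1562.5, 781.25]
```

`np.where` evaluates the condition for the whole column at once, so no explicit loop over rows is needed.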
Answered by Thomas Kimber
An alternative style that gives similar results is to write a function that performs the operation you want on a single row, using row['fieldname'] syntax to access individual values/columns, and then apply it with the DataFrame.apply method.
This echoes the answer to the question linked here: pandas create new column based on values from other columns
def normalise_row(row):
    if row['Currency'] == '$':
        ...
    ...
    ...
    return result

df['Normalized'] = df.apply(lambda row: normalise_row(row), axis=1)
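The body of `normalise_row` is elided in the answer. One possible way to fill it in for the currency case from the question is sketched here. The `else` branch and the sample data are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

def normalise_row(row):
    # Assumed body: convert dollar budgets at the rate from the question,
    # and leave every other currency untouched.
    if row['Currency'] == '$':
        result = row['Budget'] * 0.78125
    else:
        result = row['Budget']
    return result

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({'Currency': ['€', '$'], 'Budget': [5000, 2000]})

# axis=1 passes each row to the function as a Series.
df['Normalized'] = df.apply(normalise_row, axis=1)

print(df['Normalized'].tolist())
```

Row-wise `apply` calls the Python function once per row, so it is typically slower than the vectorised `np.where` approach on large frames, but it is easier to extend when the logic has many branches.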
Answered by Artem Yevtushenko
Taking Tom Kimber's suggestion one step further, you could use a function dictionary to map each condition to its calculation. This solution expands the scope of the question.
I'm using an example from a personal application.
# write the dictionary
def applyCalculateSpend(df_name, cost_method_col, metric_col, rate_col, total_planned_col):
    calculations = {
        'CPMV': df_name[metric_col] / 1000 * df_name[rate_col],
        'Free': 0
    }
    df_method = df_name[cost_method_col]
    return calculations.get(df_method, "not in dict")

# call the function inside a lambda
test_df['spend'] = test_df.apply(lambda row: applyCalculateSpend(
    row,
    cost_method_col='cost method',
    metric_col='metric',
    rate_col='rate',
    total_planned_col='total planned'), axis=1)
  cost method  metric  rate  total planned  spend
0        CPMV    2000   100           1000  200.0
1        CPMV    4000   100           1000  400.0
4        Free       1     2              3    0.0
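In the function above, every value in the dictionary is computed for each row before the lookup happens. A variant that stores callables instead evaluates only the calculation that matches the row. The following is a sketch under assumptions: the names `CALCULATIONS` and `calculate_spend` are hypothetical, and the sample frame simply mirrors the values in the output shown in the answer.

```python
import pandas as pd

# Hypothetical mapping from cost method to a function of the row.
CALCULATIONS = {
    'CPMV': lambda row: row['metric'] / 1000 * row['rate'],
    'Free': lambda row: 0.0,
}

def calculate_spend(row):
    # Pick the calculation for this row's cost method;
    # unknown methods yield None instead of raising.
    calc = CALCULATIONS.get(row['cost method'])
    return calc(row) if calc is not None else None

# Sample data mirroring the values in the answer's output.
test_df = pd.DataFrame({
    'cost method': ['CPMV', 'CPMV', 'Free'],
    'metric': [2000, 4000, 1],
    'rate': [100, 100, 2],
    'total planned': [1000, 1000, 3],
})

test_df['spend'] = test_df.apply(calculate_spend, axis=1)

print(test_df['spend'].tolist())  # [200.0, 400.0, 0.0]
```

Adding a new cost method then means adding one entry to the dictionary rather than another branch in the function.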