Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow. Original: http://stackoverflow.com/questions/10715519/

Date: 2020-09-09 00:07:08  Source: igfitidea

Conditionally fill column values based on another columns value in pandas

dataframe, pandas

Asked by Jan Willem Tulp

I have a DataFrame with a few columns. One column contains a symbol for the currency being used, for instance a euro or a dollar sign. Another column contains a budget value. So, for instance, one row could mean a budget of 5000 in euros and the next row a budget of 2000 in dollars.

In pandas, I would like to add an extra column to my DataFrame that normalizes the budgets to euros. For each row, the value in the new column should be the value from the budget column * 1 if the symbol in the currency column is a euro sign, and the value from the budget column * 0.78125 if the symbol is a dollar sign.

I know how to add a column, fill it with values, copy values from another column etc. but not how to fill the new column conditionally based on the value of another column.

Any suggestions?

Answered by Wes McKinney

You probably want to do:

import numpy as np
df['Normalized'] = np.where(df['Currency'] == '$', df['Budget'] * 0.78125, df['Budget'])
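
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the np.where approach. The sample rows are made up for illustration; only the column names and the 0.78125 rate come from the question and the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names follow the answer above.
df = pd.DataFrame({
    'Currency': ['€', '$', '$'],
    'Budget': [5000, 2000, 1000],
})

# Where Currency is '$', multiply by the rate from the question;
# every other row keeps its budget unchanged (i.e. is treated as euros).
df['Normalized'] = np.where(df['Currency'] == '$',
                            df['Budget'] * 0.78125,
                            df['Budget'])

print(df)
```

Note that np.where evaluates both branches for the whole column and then picks per row, which is why it stays fast on large frames.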

Answered by Thomas Kimber

A similar result, in an alternate style, is to write a function that performs the operation you want on a single row, using row['fieldname'] syntax to access individual values/columns, and then pass that function to the DataFrame.apply method.

This echoes the answer to the question linked here: pandas create new column based on values from other columns

def normalise_row(row):
    if row['Currency'] == '$':
        ...
    ...
    ...
    return result

df['Normalized'] = df.apply(normalise_row, axis=1)
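
The skeleton above leaves the body open. One possible way to fill it in for the currency question is sketched below; it assumes that any symbol other than '$' means euros, and the constant name USD_TO_EUR and the sample rows are illustrative, not part of the original answer.

```python
import pandas as pd

USD_TO_EUR = 0.78125  # rate quoted in the question


def normalise_row(row):
    # Assumption: '$' marks dollars; any other symbol is treated as euros.
    if row['Currency'] == '$':
        return row['Budget'] * USD_TO_EUR
    return row['Budget']


# Hypothetical sample data for illustration.
df = pd.DataFrame({'Currency': ['€', '$'], 'Budget': [5000, 2000]})

# axis=1 passes each row (as a Series) to the function.
df['Normalized'] = df.apply(normalise_row, axis=1)
print(df)
```

Row-wise apply calls the Python function once per row, so it is more flexible than np.where but noticeably slower on large frames.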

Answered by Artem Yevtushenko

Taking Tom Kimber's suggestion one step further, you could use a dictionary of calculations, keyed by condition, to choose what to compute for each row. This solution expands the scope of the question.

I'm using an example from a personal application.

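# write the dictionary: each cost method maps to its spend calculation

def applyCalculateSpend(df_name, cost_method_col, metric_col, rate_col, total_planned_col):
    # df_name is a single row here, because the function is called via apply(axis=1)
    calculations = {
            'CPMV'  : df_name[metric_col] / 1000 * df_name[rate_col],
            'Free'  : 0
            }
    df_method = df_name[cost_method_col]
    # fall back to a marker string when the cost method is not in the dictionary
    return calculations.get(df_method, "not in dict")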

# call the function inside a lambda

test_df['spend'] = test_df.apply(lambda row: applyCalculateSpend(
    row,
    cost_method_col='cost method',
    metric_col='metric',
    rate_col='rate',
    total_planned_col='total planned'), axis=1)

  cost method  metric  rate  total planned  spend
0        CPMV    2000   100           1000  200.0
1        CPMV    4000   100           1000  400.0
4        Free       1     2              3    0.0
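
In the version above, every dictionary value is computed for every row before the lookup happens. A variation that stores functions instead, so only the matching calculation runs, might look like the sketch below. The names CALCULATIONS and calculate_spend, the NaN fallback, and the rebuilt sample frame are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical mapping: each cost method maps to a function of the row,
# so only the selected calculation is evaluated.
CALCULATIONS = {
    'CPMV': lambda row: row['metric'] / 1000 * row['rate'],
    'Free': lambda row: 0.0,
}


def calculate_spend(row):
    calc = CALCULATIONS.get(row['cost method'])
    if calc is None:
        # Assumed fallback: NaN keeps the column numeric for unknown methods.
        return float('nan')
    return calc(row)


# Sample rows mirroring the output table above.
test_df = pd.DataFrame({
    'cost method': ['CPMV', 'CPMV', 'Free'],
    'metric': [2000, 4000, 1],
    'rate': [100, 100, 2],
    'total planned': [1000, 1000, 3],
})

test_df['spend'] = test_df.apply(calculate_spend, axis=1)
print(test_df)
```

Returning NaN rather than a string for unknown methods is a design choice: it avoids mixing strings and numbers in the spend column.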