Get the difference between two pandas DataFrame columns based on a condition in Python

Question

Asked by User1090

I have a dataframe named pricecomp_df. I want to compare the "market price" column with each of the other price columns ("apple price", "mangoes price", "watermelon price"), but take only one difference per row, chosen by priority: watermelon price first, then mangoes price, then apple price. The input dataframe is given below:

   code  apple price  mangoes price  watermelon price  market price
0   101          101            NaN               NaN           122
1   102          123            123               NaN           124
2   103          NaN            NaN               NaN           123
3   105          123            167               NaN           154
4   107          165            NaN               177           176
5   110          123            NaN               NaN           123

In the first row only the apple price and the market price are present, so take their difference. In the second row both the apple and mangoes prices are present, so take only the difference between the market price and the mangoes price. Apply the same priority rule to every row, and skip rows where all three prices are NaN. Can anyone help with this?

Answer 1

Answered by MERose

Hope I'm not too late. The idea is to calculate the differences and overwrite them according to your priority list.

import numpy as np
import pandas as pd

df = pd.DataFrame({'code': [101, 102, 103, 105, 107, 110],
                   'apple price': [101, 123, np.nan, 123, 165, 123],
                   'mangoes price': [np.nan, 123, np.nan, 167, np.nan, np.nan],
                   'watermelon price': [np.nan, np.nan, np.nan, np.nan, 177, np.nan],
                   'market price': [122, 124, 123, 154, 176, 123]})

# Calculate difference to apple price
df['diff'] = df['market price'] - df['apple price']
# Overwrite with difference to mangoes price
df['diff'] = df.apply(lambda x: x['market price'] - x['mangoes price'] if not np.isnan(x['mangoes price']) else x['diff'], axis=1)
# Overwrite with difference to watermelon price
df['diff'] = df.apply(lambda x: x['market price'] - x['watermelon price'] if not np.isnan(x['watermelon price']) else x['diff'], axis=1)

print(df)
   apple price  code  mangoes price  market price  watermelon price  diff
0          101   101            NaN           122               NaN    21
1          123   102            123           124               NaN     1
2          NaN   103            NaN           123               NaN   NaN
3          123   105            167           154               NaN   -13
4          165   107            NaN           176               177    -1
5          123   110            NaN           123               NaN     0
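
The same priority rule can also be written without row-wise apply. The following is a minimal sketch, not part of the original answer: it assumes the three price columns are numeric, and the names priority, chosen, and result are introduced here only for illustration. It picks the first non-NaN price in priority order, subtracts it from the market price, and drops the rows where all three prices are NaN, which the question asked for.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'code': [101, 102, 103, 105, 107, 110],
                   'apple price': [101, 123, np.nan, 123, 165, 123],
                   'mangoes price': [np.nan, 123, np.nan, 167, np.nan, np.nan],
                   'watermelon price': [np.nan, np.nan, np.nan, np.nan, 177, np.nan],
                   'market price': [122, 124, 123, 154, 176, 123]})

# Price columns ordered from highest to lowest priority (illustrative name)
priority = ['watermelon price', 'mangoes price', 'apple price']

# bfill(axis=1) fills each NaN with the next valid value to its right,
# so the first column ends up holding the highest-priority available price
chosen = df[priority].bfill(axis=1).iloc[:, 0]

df['diff'] = df['market price'] - chosen

# Skip rows where all three prices are NaN (row with code 103 here)
result = df.dropna(subset=priority, how='all')

print(result[['code', 'diff']])
# diff values for codes 101, 102, 105, 107, 110: 21, 1, -13, -1, 0
```

Keeping the priority order in a single list means a change of priority only requires reordering that list.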
