Python: subtract two columns in a dataframe

Disclaimer: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/48350850/

Time: 2020-08-19 18:40:19  Source: igfitidea

Subtract two columns in dataframe

Tags: python, pandas, dataframe

Asked by Peter

My df looks as follows:

Index    Country    Val1  Val2 ... Val10
1        Australia  1     3    ... 5
2        Bambua     12    33   ... 56
3        Tambua     14    34   ... 58

I'd like to subtract Val1 from Val10 for each country, so the output looks like:

Country    Val10-Val1
Australia  4
Bambua     44
Tambua     44

So far I've got:

def myDelta(row):
    data = row[['Val10', 'Val1']]
    return pd.Series({'Delta': np.subtract(data)})

def runDeltas():
    myDF = getDF() \
        .apply(myDelta, axis=1) \
        .sort_values(by=['Delta'], ascending=False)
    return myDF

runDeltas results in this error:

ValueError: ('invalid number of arguments', u'occurred at index 9')

What's the proper way to fix this?

Accepted answer by Alberto Chiusole

Given the following dataframe:

import pandas as pd

df = pd.DataFrame([["Australia", 1, 3, 5],
                   ["Bambua", 12, 33, 56],
                   ["Tambua", 14, 34, 58]
                  ], columns=["Country", "Val1", "Val2", "Val10"]
                 )

It comes down to a simple element-wise (vectorized) subtraction of two columns:

>>> val1_minus_val10 = df["Val1"] - df["Val10"]
>>> print(val1_minus_val10)
0    -4
1   -44
2   -44
dtype: int64
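
Note that the snippet above computes Val1 - Val10, while the question's output header asks for Val10 - Val1. The following is a minimal sketch, not the answerer's code, of how the table requested in the question could be produced; the `result` name and the `reset_index` call are illustrative assumptions. As for the original error, `np.subtract` is a binary ufunc that expects two inputs, so passing it a single Series is the likely cause of the "invalid number of arguments" message.

```python
import pandas as pd

df = pd.DataFrame(
    [["Australia", 1, 3, 5],
     ["Bambua", 12, 33, 56],
     ["Tambua", 14, 34, 58]],
    columns=["Country", "Val1", "Val2", "Val10"],
)

# Vectorized subtraction in the direction the question's header asks for.
delta = df["Val10"] - df["Val1"]

# Build the two-column result and sort it as the question's runDeltas() does.
result = (
    pd.DataFrame({"Country": df["Country"], "Val10-Val1": delta})
    .sort_values(by="Val10-Val1", ascending=False)
    .reset_index(drop=True)  # illustrative: renumber rows after sorting
)
print(result)
```

No `apply` call is needed here, because the subtraction already works on whole columns at once.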

Answer by Henry Owens

Using this as the df:

df = pd.DataFrame([["Australia", 1, 3, 5],
                   ["Bambua", 12, 33, 56],
                   ["Tambua", 14, 34, 58]
                  ], columns=["Country", "Val1", "Val2", "Val10"]
                 )

You can also do the subtraction and put it into a new column as follows.

>>> df['Val_Diff'] = df['Val10'] - df['Val1']
>>> df
     Country  Val1  Val2  Val10  Val_Diff
0  Australia     1     3      5         4
1     Bambua    12    33     56        44
2     Tambua    14    34     58        44

Answer by Rishi Bansal

You can do this with a lambda function and assign the result to a new column.

df['Val10-Val1'] = df.apply(lambda x: x['Val10'] - x['Val1'], axis=1)
print(df)
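
For a plain difference, the row-wise lambda and direct column subtraction should give the same values. The sketch below compares the two on the sample data from the other answers; the names `via_apply` and `via_columns` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [["Australia", 1, 3, 5],
     ["Bambua", 12, 33, 56],
     ["Tambua", 14, 34, 58]],
    columns=["Country", "Val1", "Val2", "Val10"],
)

# Row-wise: the lambda is called once per row because axis=1 is passed.
via_apply = df.apply(lambda x: x["Val10"] - x["Val1"], axis=1)

# Column-wise: a single vectorized subtraction over both columns.
via_columns = df["Val10"] - df["Val1"]

# Check that both approaches agree element by element.
print((via_apply == via_columns).all())
```

On large frames the column-wise form is usually the faster of the two, since `apply` with `axis=1` calls the Python function once per row.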

Answer by Prayson W. Daniel

You can also use the pandas.DataFrame.assign function, e.g.:

import numpy as np
import pandas as pd

df = pd.DataFrame([["Australia", 1, 3, 5],
                   ["Bambua", 12, 33, 56],
                   ["Tambua", 14, 34, 58]
                  ], columns=["Country", "Val1", "Val2", "Val10"]
                 )

df = df.assign(Val10_minus_Val1 = df['Val10'] - df['Val1'])

The best part of assign is that you can add as many assignments as you wish, e.g. computing the difference and then its logarithm in a single call:

df = df.assign(Val10_minus_Val1=df['Val10'] - df['Val1'],
               log_result=lambda x: np.log(x.Val10_minus_Val1))
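
The second assignment can refer to the first because a callable passed to assign is evaluated against the frame being built. As far as I know this relies on keyword order being preserved, which needs pandas 0.23 or later on Python 3.6 or later. A self-contained sketch, assuming the same sample data and using the illustrative name `result`:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [["Australia", 1, 3, 5],
     ["Bambua", 12, 33, 56],
     ["Tambua", 14, 34, 58]],
    columns=["Country", "Val1", "Val2", "Val10"],
)

# assign returns a new DataFrame; the original df is left unchanged.
result = df.assign(
    # First new column: the plain difference.
    Val10_minus_Val1=lambda d: d["Val10"] - d["Val1"],
    # Second new column: uses the column created just above.
    log_result=lambda d: np.log(d["Val10_minus_Val1"]),
)
print(result[["Country", "Val10_minus_Val1", "log_result"]])
```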

Answer by Navid

I ran into this today and wanted to share it. As others mentioned above, you can simply use:

df['Val10-Val1'] = df['Val10']-df['Val1']

but sometimes you might need the apply function, in which case you can use the following line (note axis=1, so the lambda receives one row at a time):

df['Val10-Val1'] = df.apply(lambda row: row['Val10']-row['Val1'], axis=1)
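
One detail worth checking when using apply this way is the axis argument. By default apply passes each column to the function, so looking up 'Val10' inside the lambda fails; with axis=1 it passes each row instead. A small sketch on the sample data from the earlier answers, where the `failed_without_axis` flag is only there for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    [["Australia", 1, 3, 5],
     ["Bambua", 12, 33, 56],
     ["Tambua", 14, 34, 58]],
    columns=["Country", "Val1", "Val2", "Val10"],
)

# Default axis=0: the lambda receives whole columns indexed by 0, 1, 2,
# so the label 'Val10' is not found.
failed_without_axis = False
try:
    df.apply(lambda row: row["Val10"] - row["Val1"])
except KeyError as exc:
    failed_without_axis = True
    print("without axis=1:", repr(exc))

# axis=1: the lambda receives one row at a time, indexed by column name.
df["Val10-Val1"] = df.apply(lambda row: row["Val10"] - row["Val1"], axis=1)
print(df)
```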