Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/14946494/

Time: 2020-09-13 20:39:27  Source: igfitidea

How to subtract one dataframe from another?

pandas

Asked by kjo

First, let me set the stage.

I start with a pandas dataframe klmn that looks like this:

In [15]: klmn
Out[15]: 
    K  L         M   N
0   0  a -1.374201  35
1   0  b  1.415697  29
2   0  a  0.233841  18
3   0  b  1.550599  30
4   0  a -0.178370  63
5   0  b -1.235956  42
6   0  a  0.088046   2
7   0  b  0.074238  84
8   1  a  0.469924  44
9   1  b  1.231064  68
10  2  a -0.979462  73
11  2  b  0.322454  97

Next I split klmn into two dataframes, klmn0 and klmn1, according to the value in the 'K' column:

In [16]: k0 = klmn.groupby(klmn['K'] == 0)
In [17]: klmn0, klmn1 = [klmn.iloc[k0.indices[tf]] for tf in (True, False)]
In [18]: klmn0, klmn1
Out[18]: 
(   K  L         M   N
0  0  a -1.374201  35
1  0  b  1.415697  29
2  0  a  0.233841  18
3  0  b  1.550599  30
4  0  a -0.178370  63
5  0  b -1.235956  42
6  0  a  0.088046   2
7  0  b  0.074238  84,
     K  L         M   N
8   1  a  0.469924  44
9   1  b  1.231064  68
10  2  a -0.979462  73
11  2  b  0.322454  97)
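
The same split can also be written with a boolean mask instead of groupby. The following is a minimal sketch, assuming the klmn values shown above; the name is_k0 is introduced here for illustration and is not part of the original question.

```python
import pandas as pd

# Rebuild klmn from the values shown in the question.
klmn = pd.DataFrame({
    "K": [0] * 8 + [1, 1, 2, 2],
    "L": ["a", "b"] * 6,
    "M": [-1.374201, 1.415697, 0.233841, 1.550599, -0.178370, -1.235956,
          0.088046, 0.074238, 0.469924, 1.231064, -0.979462, 0.322454],
    "N": [35, 29, 18, 30, 63, 42, 2, 84, 44, 68, 73, 97],
})

# is_k0 is a hypothetical helper name: True where K == 0.
is_k0 = klmn["K"] == 0

klmn0 = klmn[is_k0]    # rows with K == 0
klmn1 = klmn[~is_k0]   # all remaining rows
```

Both pieces keep the original row labels, so klmn1 still carries the index values 8 through 11.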

Finally, I compute the mean of the M column in klmn0, grouped by the value in the L column:

In [19]: m0 = klmn0.groupby('L')['M'].mean(); m0
Out[19]: 
L
a   -0.307671
b    0.451144
Name: M

Now, my question is: how can I subtract m0 from the M column of the klmn1 sub-dataframe, respecting the value in the L column? (By this I mean that m0['a'] gets subtracted from the M column of each row in klmn1 that has 'a' in the L column, and likewise for m0['b'].)

One could imagine doing this in a way that replaces the values in the M column of klmn1 with the new values (after subtracting the value from m0). Alternatively, one could imagine doing this in a way that leaves klmn1 unchanged, and instead produces a new dataframe klmn11 with an updated M column. I'm interested in both approaches.

Answered by Zelazny7

If you set the index of your klmn1 dataframe to the L column, then pandas will automatically align that index with the index of any series you subtract from it:

In [1]: klmn1.set_index('L')['M'] - m0
Out[1]:
L
a    0.777595
a   -0.671791
b    0.779920
b   -0.128690
Name: M
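
The result above is a series indexed by L, not an updated dataframe. One possible way to cover both variants the question asks about is to look up each row's mean with Series.map and subtract it. This is a sketch only, not part of the original answer; klmn1 and m0 are rebuilt from the values printed earlier, and klmn1_updated is a hypothetical name used so the two variants can be compared side by side.

```python
import pandas as pd

# klmn1 and m0 rebuilt from the values shown in the question.
klmn1 = pd.DataFrame(
    {"K": [1, 1, 2, 2],
     "L": ["a", "b", "a", "b"],
     "M": [0.469924, 1.231064, -0.979462, 0.322454],
     "N": [44, 68, 73, 97]},
    index=[8, 9, 10, 11],
)
m0 = pd.Series({"a": -0.307671, "b": 0.451144}, name="M")
m0.index.name = "L"

# Per-row mean: map each row's L value to the matching entry of m0.
row_means = klmn1["L"].map(m0)

# Variant 1: leave klmn1 unchanged and build a new dataframe klmn11.
klmn11 = klmn1.assign(M=klmn1["M"] - row_means)

# Variant 2: overwrite the M column (done on a copy here, under the
# hypothetical name klmn1_updated, so klmn1 stays available for comparison).
klmn1_updated = klmn1.copy()
klmn1_updated["M"] = klmn1_updated["M"] - row_means
```

Because the lookup is done row by row, the original row order and the index labels 8 through 11 are preserved, and the other columns are carried along untouched.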

Answered by Sohel Khan

Option #1:

df1.subtract(df2, fill_value=0) 

Option #2:

df1.subtract(df2, fill_value=None)
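
The two options differ only in how labels present in one dataframe but not the other are treated. A small sketch, assuming two toy dataframes with partially overlapping indices (df1, df2 and their contents are made up for illustration):

```python
import pandas as pd

# Toy data: the indices overlap only on "b".
df1 = pd.DataFrame({"x": [10.0, 20.0]}, index=["a", "b"])
df2 = pd.DataFrame({"x": [1.0, 2.0]}, index=["b", "c"])

# Option #1: a label missing on one side is treated as 0 before subtracting.
with_fill = df1.subtract(df2, fill_value=0)

# Option #2: fill_value=None is the default, so this matches df1 - df2;
# labels missing on either side produce NaN.
without_fill = df1.subtract(df2, fill_value=None)

print(with_fill)
print(without_fill)
```

With fill_value=0 every label in the union of the two indices gets a number, while without it only the shared label "b" does.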