Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/22642162/

Date: 2020-08-19 01:20:37  Source: igfitidea

Python: Divide each row of a DataFrame by another DataFrame vector

Tags: python, pandas

Asked by Plug4

I have a DataFrame (df1) with 2000 rows x 500 columns (excluding the index), and I want to divide each row by another DataFrame (df2) with 1 row x 500 columns. Both have the same column headers. I tried:

df.divide(df2) and df.divide(df2, axis='index') and several other approaches, and I always get a DataFrame with NaN values in every cell. What argument am I missing in df.divide?

Accepted answer by kimal

In df.divide(df2, axis='index'), you need to pass a single row of df2 as a Series (e.g. df2.iloc[0]) rather than the whole DataFrame, and align it on the columns:

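import pandas as pd

data1 = {"a": [1., 3., 5., 2.],
         "b": [4., 8., 3., 7.],
         "c": [5., 45., 67., 34.]}
data2 = {"a": [4.],
         "b": [2.],
         "c": [11.]}

df1 = pd.DataFrame(data1)  # 4 rows x 3 columns
df2 = pd.DataFrame(data2)  # 1 row x 3 columns

# Divide every row of df1 by the first row of df2, matching on column labels
df1.div(df2.iloc[0], axis='columns')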

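Or you can use df1 / df2.values[0, :], which divides by the underlying NumPy array and therefore matches values by position rather than by column label.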
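
The two forms give the same result only when both frames list their columns in the same order. As a small sketch of the difference (the reordered df2 below is a hypothetical input chosen for illustration, not data from the question):

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1.0, 3.0], "b": [4.0, 8.0]})
# Hypothetical divisor whose columns are in a different order than df1's
df2 = pd.DataFrame({"b": [2.0], "a": [4.0]})

# Series division aligns on column names: 'a' is divided by 4, 'b' by 2
by_label = df1.div(df2.iloc[0], axis="columns")

# Array division ignores names: first column is divided by 2, second by 4
by_position = df1 / df2.values[0, :]

print(by_label)
print(by_position)
```

If the column order of the two frames is not guaranteed to match, the label-based form is the safer choice.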

Answer by Andy Hayden

You can divide by the Series, i.e. the first row of df2:

In [11]: df = pd.DataFrame([[1., 2.], [3., 4.]], columns=['A', 'B'])

In [12]: df2 = pd.DataFrame([[5., 10.]], columns=['A', 'B'])

In [13]: df.div(df2)
Out[13]: 
     A    B
0  0.2  0.2
1  NaN  NaN

In [14]: df.div(df2.iloc[0])
Out[14]: 
     A    B
0  0.2  0.2
1  0.6  0.4

Answer by etna

Small clarification, just in case: the reason you got NaN everywhere, while Andy's first example (df.div(df2)) works for the first row, is that div tries to match indexes (and columns). In Andy's example, index 0 is found in both DataFrames, so the division is performed; index 1 is not, so a row of NaN is produced. This behavior should be even more obvious if you run the following (only the 't' row is divided):

import numpy as np
import pandas as pd

df_a = pd.DataFrame(np.random.rand(3, 5), index=['x', 'y', 't'])
df_b = pd.DataFrame(np.random.rand(2, 5), index=['z', 't'])
df_a.div(df_b)

So in your case, the index of the only row of df2 was apparently not present in df1. "Luckily", the column headers are the same in both DataFrames, so when you slice the first row, you get a Series whose index consists of the column headers of df2. Since a Series is matched against the DataFrame's columns by default, this is what eventually allows the division to take place properly.
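
A minimal sketch of that situation, assuming made-up row labels ('r1', 'r2', 'ratio') that are not taken from the original question:

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1.0, 3.0], "b": [4.0, 8.0]}, index=["r1", "r2"])
# Hypothetical: the only row of df2 carries a label that df1 does not have
df2 = pd.DataFrame({"a": [4.0], "b": [2.0]}, index=["ratio"])

# DataFrame / DataFrame aligns on row labels too; none are shared, so all NaN
whole = df1.div(df2)

# Slicing the row yields a Series indexed by the column headers 'a' and 'b'
row = df2.iloc[0]
fixed = df1.div(row)

print(whole)
print(list(row.index))
print(fixed)
```

Here the row label 'ratio' no longer matters once the row has been sliced out, because only the column headers take part in the alignment.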

For a case with index and column matching:

df_a = pd.DataFrame(np.random.rand(3,5), index= ['x', 'y', 't'], columns = range(5))
df_b = pd.DataFrame(np.random.rand(2,5), index= ['z','t'], columns = [1,2,3,4,5])
df_a.div(df_b)
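
Because the random values make the pattern harder to read, here is a deterministic variant of the same idea. It is only a sketch: the constant values 1.0 and 2.0 are assumptions chosen so that every successful division gives 0.5.

```python
import numpy as np
import pandas as pd

df_a = pd.DataFrame(np.ones((3, 5)), index=["x", "y", "t"], columns=range(5))
df_b = pd.DataFrame(np.full((2, 5), 2.0), index=["z", "t"], columns=[1, 2, 3, 4, 5])

# The result covers the union of both indexes and both sets of columns
result = df_a.div(df_b)

# Only cells whose row AND column label exist in both frames hold a value
valid = result.notna()

print(result)
print(valid.sum().sum())
```

Only row 't' combined with columns 1 to 4 is present in both frames, so those are the only cells that are not NaN.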

Answer by Cornel Ciobanu

If you want to divide every value in a column by a specific number, you could try:

df['column_name'] = df['column_name'].div(10000)

For me, this code divided each value of 'column_name' by 10,000.

Answer by Motoman

To divide a single row (with one or multiple columns) by a value, do the following:

df.loc['index_value'] = df.loc['index_value'].div(10000)
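
As a runnable sketch of dividing a single column or a single row by a scalar (the frame, its labels 'x', 'y', 'r1', 'r2', and the divisor are all made up for illustration):

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [10000.0, 20000.0], "y": [30000.0, 40000.0]},
    index=["r1", "r2"],
)

# Divide one column by a scalar: 'x' becomes 1.0 and 2.0
df["x"] = df["x"].div(10000)

# Divide one row by a scalar: row 'r2' becomes x = 0.0002, y = 4.0
df.loc["r2"] = df.loc["r2"].div(10000)

print(df)
```

Note that the row division is applied after the column division here, so the 'x' value in row 'r2' is divided twice.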