pandas: Multiplying a DataFrame by a Series

Question

Asked by jianpan

What is the best way to multiply all the columns of a Pandas DataFrame by a column vector stored in a Series? I used to do this in Matlab with repmat(), which doesn't exist in Pandas. I can use np.tile(), but converting the data structure back and forth each time looks ugly.

Thanks.

Answer 1

Answered by Wes McKinney

What's wrong with

result = dataframe.mul(series, axis=0)

?

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.mul.html#pandas.DataFrame.mul
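To make the one-liner concrete, here is a minimal sketch. The sample data (an 8x5 frame and a length-8 Series) is an assumption chosen for illustration and is not part of this answer. With axis=0 the Series is aligned against the row index, so each row i is scaled by the matching Series entry.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption, not from the answer above).
dataframe = pd.DataFrame(np.arange(40.).reshape((8, 5)), columns=list("abcde"))
series = pd.Series(np.arange(8) * 10)

# axis=0 matches the Series index against the DataFrame's row index,
# so every column is multiplied element-wise by the Series.
result = dataframe.mul(series, axis=0)

# The same values computed by broadcasting a column vector in NumPy.
expected = dataframe.to_numpy() * series.to_numpy()[:, None]
```

Because the match is done on index labels rather than position, this relies on the Series and the DataFrame sharing the same row labels.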

Answer 2

Answered by spencerlyon2

This can be accomplished quite simply with the DataFrame method apply.

In[1]: import pandas as pd; import numpy as np

In[2]: df = pd.DataFrame(np.arange(40.).reshape((8, 5)), columns=list('abcde')); df
Out[2]: 
        a   b   c   d   e
    0   0   1   2   3   4
    1   5   6   7   8   9
    2  10  11  12  13  14
    3  15  16  17  18  19
    4  20  21  22  23  24
    5  25  26  27  28  29
    6  30  31  32  33  34
    7  35  36  37  38  39

In[3]: ser = pd.Series(np.arange(8) * 10); ser
Out[3]: 
    0     0
    1    10
    2    20
    3    30
    4    40
    5    50
    6    60
    7    70

Now that we have our DataFrame and Series, we need a function to pass to apply.

In[4]: func = lambda x: np.asarray(x) * np.asarray(ser)

We can pass this to df.apply and we are good to go:

In[5]: df.apply(func)
Out[5]:
          a     b     c     d     e
    0     0     0     0     0     0
    1    50    60    70    80    90
    2   200   220   240   260   280
    3   450   480   510   540   570
    4   800   840   880   920   960
    5  1250  1300  1350  1400  1450
    6  1800  1860  1920  1980  2040
    7  2450  2520  2590  2660  2730

df.apply acts column-wise by default, but it can also act row-wise if you pass axis=1 as an argument to apply.

In[6]: ser2 = pd.Series(np.arange(5) *5); ser2
Out[6]: 
    0     0
    1     5
    2    10
    3    15
    4    20

In[7]: func2 = lambda x: np.asarray(x) * np.asarray(ser2)

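# In recent pandas versions, a function that returns an array under axis=1
# yields a Series of arrays unless result_type is given; "broadcast" keeps
# the original shape, index and columns.
In[8]: df.apply(func2, axis=1, result_type="broadcast")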
Out[8]: 
       a    b    c    d    e
    0  0    5   20   45   80
    1  0   30   70  120  180
    2  0   55  120  195  280
    3  0   80  170  270  380
    4  0  105  220  345  480
    5  0  130  270  420  580
    6  0  155  320  495  680
    7  0  180  370  570  780

This could be done more concisely by defining the anonymous function inside apply:

In[9]: df.apply(lambda x: np.asarray(x) * np.asarray(ser))
Out[9]: 
          a     b     c     d     e
    0     0     0     0     0     0
    1    50    60    70    80    90
    2   200   220   240   260   280
    3   450   480   510   540   570
    4   800   840   880   920   960
    5  1250  1300  1350  1400  1450
    6  1800  1860  1920  1980  2040
    7  2450  2520  2590  2660  2730

In[10]: df.apply(lambda x: np.asarray(x) * np.asarray(ser2), axis=1, result_type="broadcast")
Out[10]:
       a    b    c    d    e
    0  0    5   20   45   80
    1  0   30   70  120  180
    2  0   55  120  195  280
    3  0   80  170  270  380
    4  0  105  220  345  480
    5  0  130  270  420  580
    6  0  155  320  495  680
    7  0  180  370  570  780
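For comparison, the same two results can be sketched without apply by using DataFrame.mul. This is an illustration under stated assumptions rather than part of the session above: it rebuilds df, ser and ser2 with the same values, and the relabelled Series named ser2_by_col is a hypothetical helper. The relabelling is needed because ser2 is indexed 0..4 while the columns of df are labelled 'a'..'e', and mul aligns on labels, not on position.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(40.).reshape((8, 5)), columns=list("abcde"))
ser = pd.Series(np.arange(8) * 10)
ser2 = pd.Series(np.arange(5) * 5)

# Column-wise case: ser's index (0..7) already matches df's row index.
by_rows = df.mul(ser, axis=0)

# Row-wise case: give ser2 the column labels first (hypothetical helper),
# so that alignment on axis=1 pairs 'a' with 0, 'b' with 5, and so on.
ser2_by_col = pd.Series(ser2.to_numpy(), index=df.columns)
by_cols = df.mul(ser2_by_col, axis=1)
```

The np.asarray calls in the lambdas above sidestep label alignment entirely by working on position; the mul version keeps the labels in play, which is why ser2 has to be relabelled.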

Answer 3

Answered by Andy Hayden

Why not create your own dataframe tile function:

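import pandas as pd

def tile_df(df, n, m):
    # Tile df into an n-by-m grid of copies (like Matlab's repmat).
    # DataFrame.append was removed in pandas 2.0, so use pd.concat.
    # Repeat df m times side by side, renumbering the columns.
    wide = pd.concat([df] * m, axis=1, ignore_index=True)
    # Stack that wide block n times, renumbering the rows.
    return pd.concat([wide] * n, axis=0, ignore_index=True)

Example:

df = pd.DataFrame([[1, 2], [3, 4]])
tile_df(df, 2, 3)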
#    0  1  2  3  4  5
# 0  1  2  1  2  1  2
# 1  3  4  3  4  3  4
# 2  1  2  1  2  1  2
# 3  3  4  3  4  3  4

However, the docs note: "DataFrame is not intended to be a drop-in replacement for ndarray as its indexing semantics are quite different in places from a matrix." This presumably means "use NumPy if you are doing lots of matrix stuff".
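If the matrix work is done in NumPy, as suggested, explicit tiling is usually unnecessary because broadcasting handles it. A minimal sketch, assuming a small 2x2 frame and a hypothetical Series named col (neither taken from the question):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]])
col = pd.Series([10, 100])  # hypothetical column vector

# What repmat/np.tile would build explicitly: the column vector repeated
# once per DataFrame column.
tiled = np.tile(col.to_numpy()[:, None], (1, df.shape[1]))
via_tile = df.to_numpy() * tiled

# Broadcasting a (2, 1) array against a (2, 2) array gives the same values
# without building the repeated copies.
via_broadcast = df.to_numpy() * col.to_numpy()[:, None]

# Wrap the result back into a DataFrame to keep the original labels.
result = pd.DataFrame(via_broadcast, index=df.index, columns=df.columns)
```

The round trip through to_numpy() discards the index labels, so this version multiplies by position; it only matches a label-aligned multiply when the Series and the DataFrame rows are in the same order.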
