pandas: How to do a matrix product of two DataFrames in pandas?

Question

Asked by Spandyie

I am very new to Python, having recently migrated from Matlab. Is there a command in Python (pandas or NumPy) that performs Matlab-style matrix multiplication of two DataFrames created with pandas?

Answer 1

Accepted answer by Alexander

Use dot, which is available both as a method on NumPy arrays and on pandas DataFrames:

import numpy as np
import pandas as pd

np.random.seed(0)

# Numpy
m1 = np.random.randn(5, 5)
m2 = np.random.randn(5, 5)

>>> m1.dot(m2)
array([[ -5.51837355,  -4.08559942,  -1.88020209,   2.88961281,
          0.61755013],
       [  1.4732264 ,  -0.2394676 ,  -0.34717755,  -4.18527913,
         -1.75550855],
       [ -0.1871964 ,   0.76399007,  -0.26550057,  -3.43359244,
         -0.68081106],
       [ -0.23996774,   0.95331428,  -2.833788  ,  -0.37940614,
          0.05464387],
       [  3.73328914,  -0.59578959,   3.96803224, -10.65362381,
         -4.34460348]])

# Pandas
df1 = pd.DataFrame(m1)
df2 = pd.DataFrame(m2)

>>> df1.dot(df2)
          0         1         2          3         4
0 -5.518374 -4.085599 -1.880202   2.889613  0.617550
1  1.473226 -0.239468 -0.347178  -4.185279 -1.755509
2 -0.187196  0.763990 -0.265501  -3.433592 -0.680811
3 -0.239968  0.953314 -2.833788  -0.379406  0.054644
4  3.733289 -0.595790  3.968032 -10.653624 -4.344603

# Non-square case: (5, 3) times (3, 5) gives a (5, 5) result
df3 = pd.DataFrame(np.random.randn(5, 3))
df4 = pd.DataFrame(np.random.randn(3, 5))

>>> df3.dot(df4)
          0         1         2         3         4
0  0.991673  1.954500  0.322110  0.493841  0.080462
1  0.160482  1.548039 -0.826426  0.972538 -0.048610
2  0.628194  0.482943  0.742597 -0.236226  0.089525
3 -0.098316  0.817702 -0.725945  1.271506 -0.309596
4 -1.053413  0.948427 -2.445940  2.814147 -0.726829
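One detail worth knowing when coming from Matlab: DataFrame.dot matches the column labels of the left frame against the index labels of the right frame, rather than multiplying purely by position. The following is a minimal sketch of that behaviour; the frames `left`, `right`, `shuffled` and `unlabelled` and their labels are hypothetical and chosen only for illustration. It also uses the `@` operator, which pandas supports as an equivalent spelling of dot.

```python
import pandas as pd

# Hypothetical labelled frames: the columns of `left` match the index of `right`.
left = pd.DataFrame([[1, 2], [3, 4]], index=["r0", "r1"], columns=["a", "b"])
right = pd.DataFrame([[5, 6], [7, 8]], index=["a", "b"], columns=["x", "y"])

# DataFrame.dot and the @ operator compute the same matrix product.
product = left.dot(right)        # values: [[19, 22], [43, 50]]
same_product = left @ right

# Rows of `right` in a different order: dot aligns on labels,
# so the result is unchanged.
shuffled = right.loc[["b", "a"]]
aligned_product = left.dot(shuffled)

# Labels that do not match raise ValueError even though the shapes fit.
unlabelled = pd.DataFrame([[5, 6], [7, 8]], columns=["x", "y"])  # index is 0, 1
try:
    left.dot(unlabelled)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(product)
print(type(mismatch_error).__name__)
```

In the answer's examples both frames use the default integer labels 0..n-1, so the labels line up automatically and the result matches the plain NumPy product.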

Answer 2

Answer by Anton Protopopov

As an alternative to the well-known dot function, you can use numpy.matmul if you have NumPy version >= 1.10.0:

import numpy as np
import pandas as pd

np.random.seed(632)
df1 = pd.DataFrame(np.random.randn(7, 7))
df2 = pd.DataFrame(np.random.randn(7, 7))

In [68]: np.matmul(df1, df2)
Out[68]: 
array([[ 0.08535756, -3.05102895,  3.26148284, -6.27736384, -1.52042691,
         2.40667207, -0.6385153 ],
       [ 5.29731049, -0.94033606, -0.12675555,  1.10453597, -1.70722837,
         2.57797682,  2.37629556],
       [ 0.31841755, -1.46897738, -0.22734008, -4.37852181, -0.98948844,
         3.49939092, -1.36656608],
       [ 0.90757446, -4.6364365 ,  1.86254589, -4.89078986,  0.31928714,
         2.3442364 , -2.29896007],
       [-1.14428758,  6.69735827, -3.8776982 ,  6.87574565,  1.38854952,
        -2.88767356,  1.46302112],
       [ 0.8771236 , -2.01941938,  1.03461007,  0.30331467,  2.39161032,
         0.07345672, -1.30557339],
       [ 0.94310211, -0.54294898,  2.46147932, -3.21588748, -2.98369364,
         3.73941015,  1.31782966]])
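The output above is a plain NumPy array, so the row and column labels of the DataFrames are gone. If the labels matter, one option is to convert explicitly and wrap the result again. This is a sketch under the assumption that `DataFrame.to_numpy()` is available (pandas 0.24 or later); the names `raw` and `labelled` are made up for this example. Converting explicitly also avoids depending on how a given pandas/NumPy combination handles DataFrames passed directly to np.matmul, which may differ between versions.

```python
import numpy as np
import pandas as pd

np.random.seed(632)
df1 = pd.DataFrame(np.random.randn(7, 7))
df2 = pd.DataFrame(np.random.randn(7, 7))

# Convert explicitly so the inputs and the result are plain ndarrays.
raw = np.matmul(df1.to_numpy(), df2.to_numpy())

# Re-attach labels: rows come from the left operand,
# columns from the right operand.
labelled = pd.DataFrame(raw, index=df1.index, columns=df2.columns)

print(type(raw).__name__, raw.shape)
print(labelled.shape)
```

The values in `labelled` should agree with `df1.dot(df2)` up to floating-point rounding.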

Performance of np.dot and np.matmul is almost the same:

In [71]: %timeit np.dot(df1, df2)
10000 loops, best of 3: 63.7 μs per loop

In [73]: %timeit np.matmul(df1, df2)
10000 loops, best of 3: 64.2 μs per loop

Both are faster than df1.dot(df2), though:

In [82]: %timeit df1.dot(df2)
1000 loops, best of 3: 217 μs per loop
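The timings above come from IPython's %timeit. Outside IPython, a comparable measurement can be sketched with the standard-library timeit module. Note the assumptions: this version converts the frames to arrays once, outside the timed calls, so it isolates the multiplication itself and its numbers are not directly comparable to the figures quoted above. The dictionary keys and the `timings_us` name are labels invented for this sketch, and results will vary by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

np.random.seed(632)
df1 = pd.DataFrame(np.random.randn(7, 7))
df2 = pd.DataFrame(np.random.randn(7, 7))
a1, a2 = df1.to_numpy(), df2.to_numpy()

# Candidate calls; the keys are labels chosen for this sketch.
candidates = {
    "np.dot on arrays": lambda: np.dot(a1, a2),
    "np.matmul on arrays": lambda: np.matmul(a1, a2),
    "DataFrame.dot": lambda: df1.dot(df2),
}

# Best of 3 repeats of 1000 calls each, in microseconds per call.
timings_us = {}
for name, func in candidates.items():
    best = min(timeit.repeat(func, number=1000, repeat=3))
    timings_us[name] = best / 1000 * 1e6

for name, value in timings_us.items():
    print(f"{name}: {value:.1f} µs per call")
```

For matrices this small, most of the cost of DataFrame.dot is likely label alignment and constructing the result frame rather than the arithmetic, so the gap may narrow as the matrices grow.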
