Matrix multiplication in pandas
Original source: http://stackoverflow.com/questions/16472729/
Note: this content is a translated Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original URL and attribute it to the original authors on Stack Overflow.
Question
I have numeric data stored in two DataFrames x and y. The inner product from numpy works but the dot product from pandas does not.
In [63]: x.shape
Out[63]: (1062, 36)
In [64]: y.shape
Out[64]: (36, 36)
In [65]: np.inner(x, y).shape
Out[65]: (1062L, 36L)
In [66]: x.dot(y)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-66-76c015be254b> in <module>()
----> 1 x.dot(y)
C:\Programs\WinPython-64bit-2.7.3.3\python-2.7.3.amd64\lib\site-packages\pandas\core\frame.pyc in dot(self, other)
888 if (len(common) > len(self.columns) or
889 len(common) > len(other.index)):
--> 890 raise ValueError('matrices are not aligned')
891
892 left = self.reindex(columns=common, copy=False)
ValueError: matrices are not aligned
Is this a bug or am I using pandas wrong?
Accepted answer by unutbu
Not only must the shapes of x and y be compatible, but the column names of x must also match the index names of y. Otherwise this code in pandas/core/frame.py will raise a ValueError:
if isinstance(other, (Series, DataFrame)):
    common = self.columns.union(other.index)
    if (len(common) > len(self.columns) or
            len(common) > len(other.index)):
        raise ValueError('matrices are not aligned')
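To see why the check above fails for mismatched labels, here is a minimal sketch of the same length comparison on bare Index objects. The helper name labels_aligned is hypothetical (it is not part of pandas), and pd.Index([0, 1, 2]) merely stands in for the default integer index a DataFrame receives when none is given.

```python
import pandas as pd

def labels_aligned(columns, index):
    """Hypothetical helper mirroring the length check quoted above."""
    # sort=False avoids trying to order labels of mixed types (str vs int).
    common = columns.union(index, sort=False)
    return not (len(common) > len(columns) or len(common) > len(index))

x_columns = pd.Index(['col0', 'col1', 'col2'])
default_index = pd.Index([0, 1, 2])  # stand-in for a default integer index

mismatched = labels_aligned(x_columns, default_index)
matched = labels_aligned(x_columns, x_columns)
print(mismatched)  # False: the union holds 6 labels, more than either side
print(matched)     # True: the union holds the same 3 labels
```

When the labels differ, the union is larger than either set of labels, which is exactly the condition that triggers the ValueError.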
If you just want to compute the matrix product without making the column names of x match the index names of y, use the NumPy dot function:
np.dot(x, y)
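One consequence worth noting is that the NumPy result is a plain array, so the row and column labels are lost. The sketch below, using small made-up frames rather than the asker's data, shows one possible way to put labels back on the result; choosing x.index and y.columns as the labels is an assumption about what the caller wants.

```python
import numpy as np
import pandas as pd

# Small illustrative frames (not the data from the question).
x = pd.DataFrame(np.arange(6.0).reshape(3, 2), columns=['a', 'b'])
y = pd.DataFrame(np.eye(2))  # default integer labels 0, 1

# Purely positional product on the underlying arrays; labels are ignored.
product = np.dot(x.to_numpy(), y.to_numpy())

# Re-attach labels by hand: rows from x, columns from y (an assumption).
labelled = pd.DataFrame(product, index=x.index, columns=y.columns)
print(type(product).__name__)  # ndarray
print(labelled.shape)          # (3, 2)
```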
The column names of x must match the index names of y because the pandas dot method reindexes x and y: if the column order of x and the index order of y do not already match, they are made to match before the matrix product is computed:
left = self.reindex(columns=common, copy=False)
right = other.reindex(index=common, copy=False)
The NumPy dot function does no such thing. It just computes the matrix product from the values in the underlying arrays.
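The difference between label-based and positional multiplication shows up when the labels are the same but in a different order. The following is a small sketch with made-up values: the frames and the names by_label and by_position are illustrative only.

```python
import numpy as np
import pandas as pd

# x has columns in the order a, b; y has the same labels in the order b, a.
x = pd.DataFrame([[1, 2]], columns=['a', 'b'])
y = pd.DataFrame([[10], [100]], index=['b', 'a'], columns=['out'])

# pandas pairs values by label: a*y['a'] + b*y['b'] = 1*100 + 2*10
by_label = x.dot(y)
print(by_label.iloc[0, 0])   # 120

# NumPy pairs values by position: 1*10 + 2*100
by_position = np.dot(x.to_numpy(), y.to_numpy())
print(by_position[0, 0])     # 210
```

Which result is "right" depends on whether the labels carry meaning; if they do, the pandas method guards against silently multiplying misordered data.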
Here is an example which reproduces the error:
import pandas as pd
import numpy as np
columns = ['col{}'.format(i) for i in range(36)]
x = pd.DataFrame(np.random.random((1062, 36)), columns=columns)
y = pd.DataFrame(np.random.random((36, 36)))
print(np.dot(x, y).shape)
# (1062, 36)
print(x.dot(y).shape)
# ValueError: matrices are not aligned
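If the pandas method is preferred, one way to make the reproduction above succeed is to give y an index that matches the columns of x. This is only a sketch: it assumes that row i of y really corresponds to column i of x, which the labels alone cannot confirm, and the name y_labelled is introduced here for illustration.

```python
import numpy as np
import pandas as pd

columns = ['col{}'.format(i) for i in range(36)]
x = pd.DataFrame(np.random.random((1062, 36)), columns=columns)
y = pd.DataFrame(np.random.random((36, 36)))

# Relabel a copy of y so its index matches the columns of x.
# Assumption: row i of y is meant to pair with column i of x.
y_labelled = y.copy()
y_labelled.index = x.columns

result = x.dot(y_labelled)
print(result.shape)  # (1062, 36)
```

With matching labels the pandas result agrees numerically with np.dot on the underlying arrays, while keeping the index of x and the columns of y.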

