Python Pandas "Can only compare identically-labeled DataFrame objects" error

Question

Asked by user1804633

I'm using Pandas to compare the outputs of two files loaded into two data frames (uat, prod): ...

uat = uat[['Customer Number','Product']]
prod = prod[['Customer Number','Product']]
print(uat['Customer Number'] == prod['Customer Number'])
print(uat['Product'] == prod['Product'])
print(uat == prod)

The first two match exactly:
74357    True
74356    True
Name: Customer Number, dtype: bool
74357    True
74356    True
Name: Product, dtype: bool

For the third print, I get an error: Can only compare identically-labeled DataFrame objects. If the first two compared fine, what's wrong with the 3rd?

Thanks

Answer 1

Accepted answer by Andy Hayden

Here's a small example that demonstrates this (before Pandas 0.19 it applied only to DataFrames, not Series; from 0.19 on it applies to both):

In [1]: df1 = pd.DataFrame([[1, 2], [3, 4]])

In [2]: df2 = pd.DataFrame([[3, 4], [1, 2]], index=[1, 0])

In [3]: df1 == df2
Exception: Can only compare identically-labeled DataFrame objects
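To see which labels differ before comparing, it can help to check the index and the columns separately. A minimal sketch, reusing df1 and df2 from the example above and assuming a pandas version in which the failed comparison raises ValueError:

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]])
df2 = pd.DataFrame([[3, 4], [1, 2]], index=[1, 0])

# Index.equals checks both the labels and their order.
same_index = df1.index.equals(df2.index)        # row labels: [0, 1] vs [1, 0]
same_columns = df1.columns.equals(df2.columns)  # column labels: [0, 1] vs [0, 1]

# The comparison itself fails because the row labels are ordered differently.
# Assumption: the error is a ValueError, as in recent pandas versions.
try:
    df1 == df2
    comparison_raised = False
except ValueError:
    comparison_raised = True

print(same_index, same_columns, comparison_raised)
```

Here only the row labels differ in order, which is enough to make == refuse the comparison.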

One solution is to sort the index first (note: some functions require sorted indexes):

In [4]: df2.sort_index(inplace=True)

In [5]: df1 == df2
Out[5]: 
      0     1
0  True  True
1  True  True

Note: == is also sensitive to the order of columns, so you may have to use sort_index(axis=1):

In [11]: df1.sort_index().sort_index(axis=1) == df2.sort_index().sort_index(axis=1)
Out[11]: 
      0     1
0  True  True
1  True  True

Note: This can still raise (if the index/columns aren't identically labelled after sorting).
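One way to avoid the exception in that case is to check the labels after sorting and only then compare. The following is a rough sketch, not part of the original answer; safe_compare and df3 are hypothetical names introduced for illustration:

```python
import pandas as pd

def safe_compare(left, right):
    """Hypothetical helper: sort both axes, then compare only if the labels match.

    Returns the element-wise boolean DataFrame, or None when the row or
    column labels still differ after sorting.
    """
    left = left.sort_index().sort_index(axis=1)
    right = right.sort_index().sort_index(axis=1)
    labels_match = left.index.equals(right.index) and left.columns.equals(right.columns)
    if not labels_match:
        return None
    return left == right

df1 = pd.DataFrame([[1, 2], [3, 4]])
df2 = pd.DataFrame([[3, 4], [1, 2]], index=[1, 0])  # same labels, different order
df3 = pd.DataFrame([[1, 2], [3, 4]], index=[5, 6])  # different row labels

same_labels = safe_compare(df1, df2)       # boolean DataFrame
different_labels = safe_compare(df1, df3)  # None, labels cannot be matched up
```

Returning None is just one possible design choice; raising a more descriptive error would work equally well.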

Answer 2

Answer by CoreDump

You can also try dropping the index if it is not needed for the comparison:

print(df1.reset_index(drop=True) == df2.reset_index(drop=True))
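Keep in mind that after reset_index(drop=True) the rows are compared by position, not by their original labels. A small sketch of the difference, assuming df1 and df2 are the frames from Answer 1 (this answer does not define them):

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4]])
df2 = pd.DataFrame([[3, 4], [1, 2]], index=[1, 0])

# Positional comparison: first row against first row, second against second.
positional = df1.reset_index(drop=True) == df2.reset_index(drop=True)

# Label-based comparison: rows are lined up by their index label first.
by_label = df1.sort_index() == df2.sort_index()

print(positional)
print(by_label)
```

The two frames hold the same rows under the same labels, so the label-based comparison matches everywhere, while the positional one does not because the rows are stored in a different order.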

I have used this same technique in a unit test like so:

from pandas.testing import assert_frame_equal  # pandas.util.testing is deprecated and removed in pandas 2.0

assert_frame_equal(actual.reset_index(drop=True), expected.reset_index(drop=True))
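For a self-contained version of such a test, here is a minimal sketch. The data and the test name are made up for illustration, and it assumes a pandas version where assert_frame_equal can be imported from pandas.testing:

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Made-up sample data for illustration.
source = pd.DataFrame({"Customer Number": [1, 2, 3], "Product": ["a", "b", "a"]})

# Filtering keeps the original row labels, so actual has index [0, 2].
actual = source[source["Product"] == "a"]

# A freshly built frame gets the default index [0, 1].
expected = pd.DataFrame({"Customer Number": [1, 3], "Product": ["a", "a"]})

def test_filter_keeps_matching_rows():
    # Dropping the index on both sides compares values row by row by position.
    assert_frame_equal(actual.reset_index(drop=True), expected.reset_index(drop=True))

test_filter_keeps_matching_rows()
```

Without the reset_index calls, assert_frame_equal would report a mismatch here because the two indexes differ, even though the values are the same.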
