pandas ValueError: Can only compare identically-labeled Series objects (python)
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/51067449/
Asked by Sumukh
df = df1.loc[df1['CUST_ACCT_KEY'] != df2['CUST_ACCT_KEY']]
When I execute the above command, I get the following error:
ValueError: Can only compare identically-labeled Series objects
What am I doing wrong?
Both columns have dtype int64.
Answered by Scott Boston
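Pandas performs almost all of its operations with intrinsic data alignment: it matches values by index label, not by position, before it compares them or operates on them. Comparison operators such as != do not realign the operands; they raise this error when the two Series do not carry identical index labels.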
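To make the alignment behaviour concrete, here is a minimal sketch. The Series s1 and s2 and their labels are made up for illustration and are not from the question. It shows arithmetic aligning on labels while a comparison between differently labeled Series raises.

```python
import pandas as pd

# Two illustrative Series whose index labels only partly overlap.
s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "c", "d"])

# Arithmetic aligns on labels; labels present on only one side become NaN.
total = s1 + s2
print(total)

# Comparison operators do not align; they raise when the labels differ.
try:
    s1 != s2
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```

The sum is defined only for the shared labels "b" and "c", while the comparison fails outright instead of silently producing misaligned results.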
You can avoid this error by converting one of the Series to a NumPy array using .values:
df = df1.loc[df1['CUST_ACCT_KEY'] != df2['CUST_ACCT_KEY'].values]
However, this compares the two columns row by row, by position, with no index alignment, so both columns must have the same number of rows.
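Whether a positional comparison is what you want depends on the goal, which the question does not state. The sketch below uses made-up data (only the column name CUST_ACCT_KEY comes from the question) to contrast the positional comparison with a membership test using isin. The membership test would be the usual choice if the intent is "keys in df1 that do not occur anywhere in df2"; that intent is an assumption here.

```python
import pandas as pd

# Hypothetical data; only the column name comes from the question.
df1 = pd.DataFrame({"CUST_ACCT_KEY": [1, 2, 3, 4]})
df2 = pd.DataFrame({"CUST_ACCT_KEY": [2, 1, 3, 9]}, index=[10, 11, 12, 13])

# Positional comparison: first row vs first row, second vs second, and so on.
positional = df1.loc[df1["CUST_ACCT_KEY"] != df2["CUST_ACCT_KEY"].values]

# Membership test: keys in df1 that appear nowhere in df2, regardless of order.
missing = df1.loc[~df1["CUST_ACCT_KEY"].isin(df2["CUST_ACCT_KEY"])]

print(positional["CUST_ACCT_KEY"].tolist())
print(missing["CUST_ACCT_KEY"].tolist())
```

Keys 1 and 2 exist in both frames but in swapped positions, so the positional comparison reports them as different while the membership test does not.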
MCVE:
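import numpy as np
import pandas as pd

# Same length, but the index labels do not overlap (1..9 vs 11..19)
df1 = pd.DataFrame(np.arange(1, 10), index=np.arange(1, 10), columns=['A'])
df2 = pd.DataFrame(np.arange(11, 20), index=np.arange(11, 20), columns=['B'])
df1['A'] != df2['B']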
Output:
ValueError: Can only compare identically-labeled Series objects
Change one side to a NumPy array:
df1['A'] != df2['B'].values
Output:
1 True
2 True
3 True
4 True
5 True
6 True
7 True
8 True
9 True
Name: A, dtype: bool
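Note that the result keeps the index of df1 (1 to 9), because only the right-hand side was stripped of its labels. As a hedged variation on the MCVE above, the sketch below uses to_numpy(), which does the same job as .values here, and also shows resetting both indexes as another way to get a positional comparison. The variable names via_numpy and via_reset are illustrative only.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(1, 10), index=np.arange(1, 10), columns=['A'])
df2 = pd.DataFrame(np.arange(11, 20), index=np.arange(11, 20), columns=['B'])

# Same positional comparison as with .values; the result keeps df1's index.
via_numpy = df1['A'] != df2['B'].to_numpy()

# Alternative: give both sides the same default 0..n-1 index before comparing.
via_reset = df1['A'].reset_index(drop=True) != df2['B'].reset_index(drop=True)

print(via_numpy.index.tolist())
print(via_reset.index.tolist())
```

Both approaches compare by position; they differ only in which index the boolean result carries, which matters if the result is used afterwards to filter df1 with .loc.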