Original question: http://stackoverflow.com/questions/42504442/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
TypeError: 'Series' objects are mutable, thus they cannot be hashed
Asked by Mayeul sgc
I know this error is common. I tried some solutions I looked up and still can't understand what is wrong. I guess it is due to the mutable form of row and row1, but I can't figure it out.
What am I trying to do? I have two DataFrames. I need to iterate over the rows of the first one and, for each of those rows, iterate over the rows of the second one and compare the values of some columns. My code and my different attempts:
a=0
b=0
for row in Correction.iterrows():
    b+=1
    for row1 in dataframe.iterrows():
        c+=1
        a=0
        print('Handling correction '+str(b)+' and deal '+str(c))
        if (Correction.loc[row,['BO Branch Code']]==dataframe.loc[row1,['wings Branch']]
                and Correction.loc[row,['Profit Center']]==dataframe.loc[row1,['Profit Center']]
                and Correction.loc[row,['Back Office']]==dataframe.loc[row1,['Back Office']]
                and Correction.loc[row,['BO System Code']]==dataframe.loc[row1,['BO System Code']]):
I also tried:
a=0
b=0
for row in Correction.iterrows():
    b+=1
    for row1 in dataframe.iterrows():
        c+=1
        a=0
        print('Handling correction '+str(b)+' and deal '+str(c))
        if (Correction[row]['BO Branch Code']==dataframe[row1]['wings Branch']
                and Correction[row]['Profit Center']==dataframe[row1]['Profit Center']
                and Correction[row]['Back Office']==dataframe[row1]['Back Office']
                and Correction[row]['BO System Code']==dataframe[row1]['BO System Code']):
And:
a=0
b=0
for row in Correction.iterrows():
    b+=1
    for row1 in dataframe.iterrows():
        c+=1
        a=0
        print('Handling correction '+str(b)+' and deal '+str(c))
        if (Correction.loc[row,['BO Branch Code']]==dataframe[row1,['wings Branch']]
                and Correction[row,['Profit Center']]==dataframe[row1,['Profit Center']]
                and Correction[row,['Back Office']]==dataframe[row1,['Back Office']]
                and Correction[row,['BO System Code']]==dataframe[row1,['BO System Code']]):
Answered by Mayeul sgc
I found a workaround by changing my for loops to iterate over the index labels instead of over iterrows(). My code is now:
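a=0
b=0
c=0  # initialised here; the original snippet incremented c without defining it
for index in Correction.index:
    b+=1
    for index1 in dataframe.index:
        c+=1
        a=0
        print('Handling correction '+str(b)+' and deal '+str(c))
        # index and index1 are row labels, so .loc returns scalar values here
        if (Correction.loc[index,'BO Branch Code']==dataframe.loc[index1,'Wings Branch']
                and Correction.loc[index,'Profit Center']==dataframe.loc[index1,'Profit Center']
                and Correction.loc[index,'Back Office']==dataframe.loc[index1,'Back Office']
                and Correction.loc[index,'BO System Code']==dataframe.loc[index1,'BO System Code']):
            pass  # the handling of a matching row was not shown in the original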
Answered by Vikash Singh
I think you are iterating over your DataFrame incorrectly:
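# iterrows() yields (index, row) pairs; the row is a Series that supports
# lookup by column name. (itertuples() rows cannot be indexed by a string.)
for _, row in Correction.iterrows():
    bo_branch_code = row['BO Branch Code']
    for _, row1 in dataframe.iterrows():
        if row1['wings Branch'] == bo_branch_code:
            # do stuff here
            pass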
Reference on how to iterate over a DataFrame: https://github.com/vi3k6i5/pandas_basics/blob/master/2.A%20Iterate%20over%20a%20dataframe.ipynb
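To make the likely cause of the error concrete, here is a minimal sketch on a small made-up frame (the data and the name `correction` are hypothetical, only the column names come from the question). `iterrows()` yields `(index, Series)` pairs, so a bare `for row in df.iterrows()` binds `row` to a tuple that contains a Series. Using that tuple as a lookup key requires hashing it, and a Series cannot be hashed.

```python
import pandas as pd

# Hypothetical stand-in for the Correction frame from the question.
correction = pd.DataFrame({"BO Branch Code": [10, 20], "Profit Center": ["A", "B"]})

# Each item from iterrows() is an (index, Series) tuple, not a row label.
first = next(iter(correction.iterrows()))
is_pair = isinstance(first, tuple) and isinstance(first[1], pd.Series)

# Hashing the tuple hashes its elements, and hashing a Series raises TypeError.
try:
    hash(first)
    hash_error = None
except TypeError as exc:
    hash_error = str(exc)

# Unpacking the pair gives the label and the row data separately.
codes = [row["BO Branch Code"] for _, row in correction.iterrows()]
print(is_pair, hash_error, codes)
```

Unpacking as `for idx, row in df.iterrows()` keeps the label and the row apart, so `row['some column']` is an ordinary lookup by column name.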
I timed your index approach and the iterrows approach. Here are the results:
import pandas as pd
import numpy as np
import time

df = pd.DataFrame(np.random.randint(0,100,size=(10, 4)), columns=list('ABCD'))
df_2 = pd.DataFrame(np.random.randint(0,100,size=(10, 4)), columns=list('ABCD'))

def test_time():
    # index-based approach: look up each cell with .loc
    for index in df.index:
        for index1 in df_2.index:
            if (df.loc[index, 'A'] == df_2.loc[index1, 'A']):
                continue

def test_time_2():
    # iterrows approach: unpack (index, row) and read the value from the row
    for idx, row in df.iterrows():
        a_val = row['A']
        for idy, row_1 in df_2.iterrows():
            if (a_val == row_1['A']):
                continue

# time.clock() was removed in Python 3.8; time.perf_counter() replaces it
start = time.perf_counter()
test_time()
end = time.perf_counter()
print(end-start)
# 0.038514999999999855

start = time.perf_counter()
test_time_2()
end = time.perf_counter()
print(end-start)
# 0.009272000000000169
Simply put, in this test iterrows was roughly four times faster than your index-based approach.
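A single timed run on a 10-row frame is noisy, so the comparison is worth repeating before drawing conclusions. The following is a sketch of the same two loops measured with the standard-library `timeit` module. The function names `loc_loop` and `iterrows_loop` and the fixed random seed are assumptions made for this illustration, and no particular timings are claimed.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so both loops see the same data
df = pd.DataFrame(rng.integers(0, 100, size=(10, 4)), columns=list("ABCD"))
df_2 = pd.DataFrame(rng.integers(0, 100, size=(10, 4)), columns=list("ABCD"))

def loc_loop():
    """Count equal 'A' values using label-based .loc lookups."""
    hits = 0
    for i in df.index:
        for j in df_2.index:
            if df.loc[i, "A"] == df_2.loc[j, "A"]:
                hits += 1
    return hits

def iterrows_loop():
    """Count equal 'A' values using iterrows() on both frames."""
    hits = 0
    for _, row in df.iterrows():
        a_val = row["A"]
        for _, row_1 in df_2.iterrows():
            if a_val == row_1["A"]:
                hits += 1
    return hits

# Take the best of several repeats to reduce the effect of noise.
loc_best = min(timeit.repeat(loc_loop, number=5, repeat=3))
iterrows_best = min(timeit.repeat(iterrows_loop, number=5, repeat=3))
print(f"loc: {loc_best:.4f}s  iterrows: {iterrows_best:.4f}s")
```

Both functions return the number of matches, which makes it easy to check that they do the same work before comparing their timings.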
For good approaches to looping over a DataFrame, see the Stack Overflow question "What is the most efficient way to loop through dataframes with pandas?"
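If the goal is only to find which rows of the two frames agree on the four columns, the nested loop can often be avoided altogether. This is not what either answer proposes. It is a sketch of a swapped-in technique, `DataFrame.merge`, on made-up sample data (the names `correction` and `deals` and all values are hypothetical, only the column names come from the question).

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
correction = pd.DataFrame({
    "BO Branch Code": [1, 2],
    "Profit Center": ["P1", "P2"],
    "Back Office": ["X", "Y"],
    "BO System Code": ["S1", "S2"],
})
deals = pd.DataFrame({
    "wings Branch": [2, 3, 1],
    "Profit Center": ["P2", "P3", "P9"],
    "Back Office": ["Y", "Z", "X"],
    "BO System Code": ["S2", "S3", "S1"],
})

# An inner merge keeps only the row pairs where all four key columns agree.
matches = correction.merge(
    deals,
    left_on=["BO Branch Code", "Profit Center", "Back Office", "BO System Code"],
    right_on=["wings Branch", "Profit Center", "Back Office", "BO System Code"],
    how="inner",
)
print(matches)
```

Each row of `matches` corresponds to one (correction, deal) pair that would have passed the `if` test in the loop, so any per-match handling can then run over this smaller result.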