Note: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/30788061/

Date: 2020-09-13 23:27:44  Source: igfitidea

ValueError: cannot reindex from a duplicate axis using isin with pandas

Tags: python, pandas, dataframe

Asked by icomefromchaos

I am trying to sort zip codes into various files, but I keep getting

ValueError: cannot reindex from a duplicate axis

I've read through other posts on Stackoverflow, but I haven't been able to figure out why it's duplicating an axis.

import csv
import pandas as pd
from pandas import DataFrame as df
fp = '/Users/User/Development/zipcodes/file.csv'
file1 = open(fp, 'rb').read()
df = pd.read_csv(fp, sep=',')

df = df[['VIN', 'Reg Name', 'Reg Address', 'Reg City', 'Reg ST', 'ZIP',
         'ZIP', 'Catagory', 'Phone', 'First Name', 'Last Name', 'Reg NFS',
         'MGVW', 'Make', 'Veh Model','E Mfr', 'Engine Model', 'CY2010',
         'CY2011', 'CY2012', 'CY2013', 'CY2014', 'CY2015', 'Std Cnt', 
        ]]
#reader.head(1)
df.head(1)
zipBlue = [65355, 65350, 65345, 65326, 65335, 64788, 64780, 64777, 64743,
64742, 64739, 64735, 64723, 64722, 64720]

The script also defines zipGreen, zipRed, zipYellow, and zipLightBlue, but I did not include them in the example.

def IsInSort():
    blue = df[df.ZIP.isin(zipBlue)]
    green = df[df.ZIP.isin(zipGreen)]
    red = df[df.ZIP.isin(zipRed)]
    yellow = df[df.ZIP.isin(zipYellow)]
    LightBlue = df[df.ZIP.isin(zipLightBlue)]
def SaveSortedZips():
    blue.to_csv('sortedBlue.csv')
    green.to_csv('sortedGreen.csv')
    red.to_csv('sortedRed.csv')
    yellow.to_csv('sortedYellow.csv')
    LightBlue.to_csv('SortedLightBlue.csv')
IsInSort()
SaveSortedZips()

   1864         # trying to reindex on an axis with duplicates
   1865         if not self.is_unique and len(indexer):
-> 1866             raise ValueError("cannot reindex from a duplicate axis")
   1867
   1868     def reindex(self, target, method=None, level=None, limit=None):

ValueError: cannot reindex from a duplicate axis

Answered by firelynx

I'm pretty sure your problem is related to your column selection (your mask):

df = df[['VIN', 'Reg Name', 'Reg Address', 'Reg City', 'Reg ST', 'ZIP',
         'ZIP', 'Catagory', 'Phone', 'First Name', 'Last Name', 'Reg NFS',
         'MGVW', 'Make', 'Veh Model','E Mfr', 'Engine Model', 'CY2010',
         'CY2011', 'CY2012', 'CY2013', 'CY2014', 'CY2015', 'Std Cnt', 
        ]]

'ZIP' is in there twice. Removing one of them should solve the problem.
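To illustrate the point, here is a minimal sketch using made-up sample data (the `raw` frame, its values, and the shortened `zip_blue` list are assumptions, not the asker's file). It shows that listing 'ZIP' twice produces two columns with the same label, so `dup["ZIP"]` comes back as a DataFrame rather than a Series, which is likely what pushes pandas into the reindexing path. Listing it once lets `isin` build an ordinary one-dimensional mask.

```python
import pandas as pd

# Hypothetical sample data standing in for file.csv
raw = pd.DataFrame({
    "VIN": ["A1", "B2", "C3"],
    "ZIP": [65355, 10001, 64720],
    "Make": ["Ford", "Kia", "Ram"],
})
zip_blue = [65355, 64720]

# Listing 'ZIP' twice, as in the question, yields two columns named 'ZIP'
dup = raw[["VIN", "ZIP", "ZIP", "Make"]]
dup_is_frame = isinstance(dup["ZIP"], pd.DataFrame)  # a DataFrame, not a Series

# Listing 'ZIP' once keeps the column a Series, so isin gives a 1-D row mask
fixed = raw[["VIN", "ZIP", "Make"]]
blue = fixed[fixed["ZIP"].isin(zip_blue)]
```

With the duplicate gone, the same filtering pattern should carry over to the other zip lists.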

The error ValueError: cannot reindex from a duplicate axis is one of those very cryptic pandas errors that simply does not tell you what went wrong.

The error is often caused by two columns having the same name, either before the operation or as an internal result of it.
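If it is not obvious where the duplicate came from, one way to check is to inspect the column index directly. This is a small sketch with hypothetical data (the one-row frame below is an assumption). `is_unique` and `duplicated()` are standard pandas Index members, and keeping only the first occurrence of each label is just one possible cleanup choice.

```python
import pandas as pd

# Hypothetical frame that already carries a duplicated column label
df = pd.DataFrame([[1, 2, 3]], columns=["VIN", "ZIP", "ZIP"])

# True when at least one column label appears more than once
has_dupes = not df.columns.is_unique

# Names of the labels that repeat (second and later occurrences)
dupe_names = df.columns[df.columns.duplicated()].tolist()

# Keep only the first occurrence of each column label
deduped = df.loc[:, ~df.columns.duplicated()]
```

Checking `df.columns.is_unique` right before a filtering step can be a quick way to rule this cause in or out.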