Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/21533706/

Time: 2020-09-13 21:39:30  Source: igfitidea

Resolving Reindexing only valid with uniquely valued Index objects

Tags: python, pandas

Asked by Paul

I have viewed many of the questions that come up with this error. I am running pandas '0.10.1'

import numpy as np
from pandas import DataFrame

df = DataFrame({'A': np.random.randn(5), 'B': np.random.randn(5),
                'C': np.random.randn(5), 'D': ['a', 'b', 'c', 'd', 'e']})

#gives error
df.take([2,0,1,2,3], axis=1).drop(['C'],axis=1)

#works fine
df.take([2,0,1,2,1], axis=1).drop(['C'],axis=1)

The only thing I can see is that in the former case I include the non-numeric column, which seems to be affecting the index somehow, but the command below returns an empty result:

df.take([2,0,1,2,3], axis=1).index.get_duplicates()

The Q&A "Reindexing error makes no sense" does not seem to apply, as my old index is unique.

As far as I can tell, my index is unique. I checked it with the command df.take([2,0,1,2,3], axis=1).index.get_duplicates(), taken from this Q&A: "problems with reindexing dataframes: Reindexing only valid with uniquely valued Index objects".

"Reindexing only valid with uniquely valued Index objects" does not seem to apply.

I think my pandas version is OK, so the bug described in this Q&A should not be the problem: "pandas Reindexing only valid with uniquely valued Index objects".

Answered by simple

Firstly, I believe you meant to test for duplicates using the following command:

df.take([2,0,1,2,3],axis=1).columns.get_duplicates()

because if you use index instead of columns, it will obviously return an empty array: index holds the row labels (0 through 4 here), and those don't repeat. The above command returns, as expected:

['C']
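For readers on a recent pandas release, note that `Index.get_duplicates()` was deprecated in 0.23 and removed in 1.0, so the commands above will not run there as written. The following is a minimal sketch of the same check using `Index.duplicated()`; the variable names `taken`, `row_dupes`, and `col_dupes` are illustrative and not from the original thread.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": np.random.randn(5),
    "B": np.random.randn(5),
    "C": np.random.randn(5),
    "D": ["a", "b", "c", "d", "e"],
})

# Selecting column position 2 twice yields two columns labelled 'C'.
taken = df.take([2, 0, 1, 2, 3], axis=1)

# Row labels are 0..4, so the row index has no duplicates.
row_dupes = taken.index[taken.index.duplicated()].tolist()

# The column labels are where the repeated value lives.
col_dupes = taken.columns[taken.columns.duplicated()].tolist()

print(row_dupes)  # []
print(col_dupes)  # ['C']
```

This makes the distinction in the answer concrete: `take(..., axis=1)` leaves the row index untouched and duplicates a column label instead.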

Secondly, I think you're right, the non-numeric column is throwing it off, because even if you use the following, there is still an error:

df = DataFrame({'A' : np.random.randn(5), 'B' : np.random.randn(5),'C' :np.random.randn(5), 'D':[str(x) for x in np.random.randn(5) ]})

It could be a bug, because if you check the core file called 'index.py', lines 86 and 1228 set the expected engine type to, respectively:

_engine_type = _index.ObjectEngine


_engine_type = _index.Int64Engine

and neither of those seems to expect a string, if you look deeper into the documentation. That's the best I've got, good luck! Let me know if you solve this, as I'm interested too.
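The thread does not give a fix, but one way to sidestep label-based dropping when column labels repeat is to select columns with a boolean mask. The sketch below is a possible workaround under that assumption, not the original author's solution; `taken`, `without_c`, `deduped`, and `result` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": np.random.randn(5),
    "B": np.random.randn(5),
    "C": np.random.randn(5),
    "D": ["a", "b", "c", "d", "e"],
})

# Columns become C, A, B, C, D (label 'C' appears twice).
taken = df.take([2, 0, 1, 2, 3], axis=1)

# Option 1: keep every column whose label is not 'C'.
# A boolean mask selects by position, so duplicate labels are not an issue.
without_c = taken.loc[:, taken.columns != "C"]

# Option 2: keep only the first occurrence of each label, then drop by label.
deduped = taken.loc[:, ~taken.columns.duplicated()]
result = deduped.drop(columns=["C"])

print(list(without_c.columns))
print(list(result.columns))
```

Both options end with the columns A, B, and D; the first removes every 'C' column in one step, while the second is useful when the de-duplicated frame is also needed on its own.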
