Python TypeError: unhashable type: 'slice' for pandas
Source: http://stackoverflow.com/questions/44871017/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors on Stack Overflow (not to this page).
Asked by octavian
I have a pandas data structure, which I create like this:
test_inputs = pd.read_csv("../input/test.csv", delimiter=',')
Its shape
print(test_inputs.shape)
is this
(28000, 784)
I would like to print a subset of its rows, like this:
print(test_inputs[100:200, :])
print(test_inputs[100:200, :].shape)
However, I am getting:
TypeError: unhashable type: 'slice'
Any idea what could be wrong?
Accepted answer by jezrael
There are several possible solutions, but their output is not the same:
loc selects by labels, so both the start and the stop label are included. iloc and plain slicing (without an indexer function) select by positions, so the start bound is included while the upper bound is excluded (see the docs):
import numpy as np
import pandas as pd
test_inputs = pd.DataFrame(np.random.randint(10, size=(28, 7)))
print(test_inputs.loc[10:20])
    0  1  2  3  4  5  6
10  3  2  0  6  6  0  0
11  5  0  2  4  1  5  2
12  5  3  5  4  1  3  5
13  9  5  6  6  5  0  1
14  7  0  7  4  2  2  5
15  2  4  3  3  7  2  3
16  8  9  6  0  5  3  4
17  1  1  0  7  2  7  7
18  1  2  2  3  5  8  7
19  5  1  1  0  1  8  9
20  3  6  7  3  9  7  1
print(test_inputs.iloc[10:20])
    0  1  2  3  4  5  6
10  3  2  0  6  6  0  0
11  5  0  2  4  1  5  2
12  5  3  5  4  1  3  5
13  9  5  6  6  5  0  1
14  7  0  7  4  2  2  5
15  2  4  3  3  7  2  3
16  8  9  6  0  5  3  4
17  1  1  0  7  2  7  7
18  1  2  2  3  5  8  7
19  5  1  1  0  1  8  9
print(test_inputs[10:20])
    0  1  2  3  4  5  6
10  3  2  0  6  6  0  0
11  5  0  2  4  1  5  2
12  5  3  5  4  1  3  5
13  9  5  6  6  5  0  1
14  7  0  7  4  2  2  5
15  2  4  3  3  7  2  3
16  8  9  6  0  5  3  4
17  1  1  0  7  2  7  7
18  1  2  2  3  5  8  7
19  5  1  1  0  1  8  9
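To make the difference in bounds easy to check, here is a minimal sketch that counts the rows each form returns. It uses np.arange instead of the random data above, which is an assumption made only so the result is reproducible; the variable names by_label, by_position and by_slice are illustrative.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random frame: 28 rows, 7 columns,
# with the default integer index 0..27.
test_inputs = pd.DataFrame(np.arange(28 * 7).reshape(28, 7))

by_label = test_inputs.loc[10:20]      # label-based: label 20 is included
by_position = test_inputs.iloc[10:20]  # position-based: position 20 is excluded
by_slice = test_inputs[10:20]          # plain integer slice: also by position

print(len(by_label), len(by_position), len(by_slice))  # 11 10 10
```

With the default index the labels and positions coincide, so the only visible difference is the extra row at label 20.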
Answer by Leonid Mednikov
Indexing in pandas is really confusing, as it looks like list indexing but it is not. You need to use .iloc, which indexes by position:
print(test_inputs.iloc[100:200, :])
If you don't need column selection, you can omit it:
print(test_inputs.iloc[100:200])
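As a rough, self-contained check of the above, the sketch below builds a small stand-in frame (300 rows of zeros rather than the real test.csv, which is an assumption) and compares the failing call from the question with the .iloc version. The exact exception raised by the plain [] call seems to depend on the pandas and Python versions, so the sketch catches it broadly.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for test.csv: 300 rows instead of 28000.
test_inputs = pd.DataFrame(np.zeros((300, 784), dtype=np.int64))

# A (rows, columns) pair passed to plain [] is treated as a single
# column key, so the lookup fails. The exception type varies across
# pandas and Python versions, hence the broad except.
try:
    test_inputs[100:200, :]
    plain_indexing_failed = False
except Exception as exc:
    plain_indexing_failed = True
    print(type(exc).__name__, exc)

# .iloc accepts the row slice and column slice as positions.
subset = test_inputs.iloc[100:200, :]
print(subset.shape)  # (100, 784)
```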
P.S. Using .loc is not what you want here, as it looks up rows not by row number but by index label (the index can be filled with anything: the labels need not be numbers and need not be unique). A range in .loc finds the rows labeled 100 and 200 and returns them together with the rows in between. Plain [] does slice rows by position when given an integer slice, but it does not accept a row, column pair. If you have just created the DataFrame, .iloc and .loc may give similar results, but relying on .loc in this case is very bad practice: it leads to hard-to-understand problems once the index changes for some reason (for example, after you select a subset of rows, the row number and the index label are no longer the same).
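The point about the index drifting away from the row number can be illustrated with a small sketch. The frame, the column name value and the variable names are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"value": range(10)})  # index labels 0..9

# Keep the even rows only: the index labels are now 0, 2, 4, 6, 8.
evens = df[df["value"] % 2 == 0]

by_position = evens.iloc[1:3]  # 2nd and 3rd remaining rows
by_label = evens.loc[1:3]      # rows whose label lies between 1 and 3

print(by_position.index.tolist())  # [2, 4]
print(by_label.index.tolist())     # [2]

# Resetting the index makes labels match row numbers again.
renumbered = evens.reset_index(drop=True)
print(renumbered.index.tolist())   # [0, 1, 2, 3, 4]
```

After the filter, the same 1:3 range means two different things, which is why .iloc is the safer choice when you mean row numbers.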
Answer by user_gautam
I was facing the same problem, and even the solutions above didn't fix it for me. It seemed to be a problem with pandas. What I did was convert the DataFrame into a NumPy array, and then there was no problem.
import pandas as pd
import numpy as np
test_inputs = pd.read_csv("../input/test.csv", delimiter=',')
test_inputs = np.asarray(test_inputs)
Answer by vipin bansal
print(test_inputs.values[100:200, :])
print(test_inputs.values[100:200, :].shape)
This code also works for me.
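Both NumPy-based answers work for the same reason: once the data is a NumPy array, a (row slice, column slice) pair is valid indexing. Here is a minimal sketch, assuming a small stand-in frame rather than the real test.csv. DataFrame.to_numpy() is not used in the answers above; it is shown here as the documented alternative to .values. Note that the result is a plain array, so column labels and the index are lost.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for test.csv: 300 rows instead of 28000.
test_inputs = pd.DataFrame(np.zeros((300, 784), dtype=np.int64))

as_array = test_inputs.to_numpy()     # same data as test_inputs.values
also_array = np.asarray(test_inputs)  # the conversion used by user_gautam

# NumPy arrays accept a (row slice, column slice) pair directly.
print(as_array[100:200, :].shape)            # (100, 784)
print(also_array[100:200, :].shape)          # (100, 784)
print(type(as_array[100:200, :]).__name__)   # ndarray
```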