Python TypeError: unhashable type: 'slice' for pandas

Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/44871017/

Date: 2020-08-20 00:33:08  Source: igfitidea

Tags: python, pandas

Asked by octavian

I have a pandas data structure, which I create like this:

test_inputs = pd.read_csv("../input/test.csv", delimiter=',')

Its shape

print(test_inputs.shape)

is this

(28000, 784)

I would like to print a subset of its rows, like this:

print(test_inputs[100:200, :])
print(test_inputs[100:200, :].shape)

However, I am getting:

TypeError: unhashable type: 'slice'

Any idea what could be wrong?

Accepted answer by jezrael

There are several possible solutions, but their output is not the same:

loc selects by labels, with both bounds included. iloc and plain slicing (without a function) select by position: the start bound is included, while the upper bound is excluded (see the docs on selection by position):

import numpy as np
import pandas as pd

test_inputs = pd.DataFrame(np.random.randint(10, size=(28, 7)))

print(test_inputs.loc[10:20])
    0  1  2  3  4  5  6
10  3  2  0  6  6  0  0
11  5  0  2  4  1  5  2
12  5  3  5  4  1  3  5
13  9  5  6  6  5  0  1
14  7  0  7  4  2  2  5
15  2  4  3  3  7  2  3
16  8  9  6  0  5  3  4
17  1  1  0  7  2  7  7
18  1  2  2  3  5  8  7
19  5  1  1  0  1  8  9
20  3  6  7  3  9  7  1


print(test_inputs.iloc[10:20])
    0  1  2  3  4  5  6
10  3  2  0  6  6  0  0
11  5  0  2  4  1  5  2
12  5  3  5  4  1  3  5
13  9  5  6  6  5  0  1
14  7  0  7  4  2  2  5
15  2  4  3  3  7  2  3
16  8  9  6  0  5  3  4
17  1  1  0  7  2  7  7
18  1  2  2  3  5  8  7
19  5  1  1  0  1  8  9

print(test_inputs[10:20])
    0  1  2  3  4  5  6
10  3  2  0  6  6  0  0
11  5  0  2  4  1  5  2
12  5  3  5  4  1  3  5
13  9  5  6  6  5  0  1
14  7  0  7  4  2  2  5
15  2  4  3  3  7  2  3
16  8  9  6  0  5  3  4
17  1  1  0  7  2  7  7
18  1  2  2  3  5  8  7
19  5  1  1  0  1  8  9
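
The three outputs above differ only in whether row 20 is included. A minimal sketch of that difference, assuming a small random DataFrame with the default integer index (the names `df`, `by_label`, `by_position`, `by_brackets` and `failed` are illustrative and not from the original post). It also retries the tuple-of-slices lookup from the question; the exact exception type seems to depend on the pandas version, so it is caught generically here.

```python
import numpy as np
import pandas as pd

# Illustrative data: 28 rows, 7 columns, default index 0..27
df = pd.DataFrame(np.random.randint(10, size=(28, 7)))

by_label = df.loc[10:20]      # label-based: both 10 and 20 are included
by_position = df.iloc[10:20]  # position-based: 20 is excluded
by_brackets = df[10:20]       # plain integer slice: rows by position, 20 excluded

print(len(by_label), len(by_position), len(by_brackets))  # 11 10 10

# NumPy-style 2-D indexing directly on the DataFrame fails, because
# df[...] treats the tuple of slices as a single column key.
try:
    df[10:20, :]
    failed = False
except Exception as exc:  # TypeError in older pandas; may differ in newer versions
    failed = True
    print(type(exc).__name__, exc)
```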

Answer by Leonid Mednikov

Indexing in pandas is really confusing, as it looks like list indexing but it is not. You need to use .iloc, which indexes by position:

print(test_inputs.iloc[100:200, :])

And if you don't need column selection, you can omit it:

print(test_inputs.iloc[100:200])
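
As a sketch of that equivalence, assuming a synthetic DataFrame of zeros in place of `test.csv` (the file is not available here, and the 300 x 784 shape is an arbitrary stand-in; `with_columns`, `rows_only` and `first_three` are illustrative names):

```python
import numpy as np
import pandas as pd

# Stand-in for the CSV data: 300 rows, 784 columns of zeros (hypothetical)
test_inputs = pd.DataFrame(np.zeros((300, 784), dtype=int))

with_columns = test_inputs.iloc[100:200, :]  # explicit "all columns"
rows_only = test_inputs.iloc[100:200]        # column selector omitted

print(with_columns.shape)              # (100, 784)
print(with_columns.equals(rows_only))  # True

# A column slice works the same way, again by position
first_three = test_inputs.iloc[100:200, :3]
print(first_three.shape)               # (100, 3)
```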

P.S. Using .loc (or just []) is not what you want, as it looks not for the row number but for the row index label (which can be filled with anything: not necessarily numbers, not necessarily unique). A range in .loc finds the rows with index values 100 and 200 and returns the rows between them, including both ends. On a freshly created DataFrame, .iloc and .loc may give nearly the same result, but using .loc in this case is very bad practice: it leads to hard-to-understand problems once the index changes for some reason (for example, after you select a subset of rows, the row number and the index no longer match).
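
A small sketch of how row number and index drift apart after selecting a subset, assuming a ten-row illustrative frame (the column name `value` and the variable names are made up for this example):

```python
import pandas as pd

# Illustrative frame with the default index 0..9
df = pd.DataFrame({"value": range(10)})

# Keep only the even values; the surviving rows keep their old labels
subset = df[df["value"] % 2 == 0]
print(subset.index.tolist())          # [0, 2, 4, 6, 8]

by_position = subset.iloc[1:3]  # 2nd and 3rd rows of the subset
by_label = subset.loc[1:3]      # rows whose labels lie between 1 and 3

print(by_position["value"].tolist())  # [2, 4]
print(by_label["value"].tolist())     # [2]
```

The same slice bounds now return different rows, which is the kind of silent mismatch the answer warns about.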

Answer by user_gautam

I was facing the same problem, and even the solutions above couldn't fix it. It seemed to be some problem with pandas. What I did was convert the DataFrame into a NumPy array, and then there was no problem.

import pandas as pd
import numpy as np
test_inputs = pd.read_csv("../input/test.csv", delimiter=',')
test_inputs = np.asarray(test_inputs)
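
A sketch of why the conversion helps, assuming a small synthetic DataFrame instead of `test.csv` (values and column names are hypothetical): once the data is a NumPy array, `[rows, columns]` slicing is valid, but the index and column labels are dropped. `DataFrame.to_numpy()` is shown as an equivalent conversion.

```python
import numpy as np
import pandas as pd

# Stand-in for the CSV data (hypothetical values and column names)
test_inputs = pd.DataFrame(np.arange(1500).reshape(300, 5),
                           columns=list("abcde"))

as_array = np.asarray(test_inputs)  # plain 2-D ndarray, labels dropped
chunk = as_array[100:200, :]        # NumPy-style row/column slicing works
print(type(chunk).__name__, chunk.shape)  # ndarray (100, 5)

# to_numpy() yields the same values
same = np.array_equal(as_array, test_inputs.to_numpy())
print(same)  # True

# The positional pandas equivalent keeps the labels instead
kept_columns = test_inputs.iloc[100:200, :].columns.tolist()
print(kept_columns)  # ['a', 'b', 'c', 'd', 'e']
```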

Answer by vipin bansal

print(test_inputs.values[100:200, :])
print(test_inputs.values[100:200, :].shape)

This code also works for me.