pandas: Debug TypeError: unhashable type: 'numpy.ndarray'

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/35885693/

Time: 2020-09-14 00:50:06  Source: igfitidea

Tags: python, numpy, pandas, matplotlib

Asked by jax

I am working on k-means clustering. I wrote some code with the help of references available on the web, but when I run it, it raises an error:

Traceback (most recent call last):
  File "clustering.py", line 16, in <module>
    ds = df[np.where(labels==i)]
  File "/usr/lib/python2.7/dist-packages/pandas/core/frame.py", line 1678, in __getitem__
    return self._getitem_column(key)
  File "/usr/lib/python2.7/dist-packages/pandas/core/frame.py", line 1685, in _getitem_column
    return self._get_item_cache(key)
  File "/usr/lib/python2.7/dist-packages/pandas/core/generic.py", line 1050, in _get_item_cache
    res = cache.get(item)
TypeError: unhashable type: 'numpy.ndarray'

Many previous threads report the same error, but none of their solutions resolves it in my program. How can I debug this error?

The code I used:

from sklearn import cluster
import pandas as pd

df = [
[0.57,-0.845,-0.8277,-0.1585,-1.616],
[0.47,-0.14,-0.5277,-0.158,-1.716],
[0.17,-0.845,-0.5277,-0.158,-1.616],
[0.27,-0.14,-0.8277,-0.158,-1.716]]

df = pd.DataFrame(df,columns= ["a","b","c","d", "e"])

# df = pd.read_csv("cleaned_remove_cor.csv")

k = 3
kmeans = cluster.KMeans(n_clusters=k)
kmeans.fit(df)
labels = kmeans.labels_
centroids = kmeans.cluster_centers_
from matplotlib import pyplot
import numpy as np

for i in range(k):
    # select only data observations with cluster label == i
    ds = df[np.where(labels==i)]
    # plot the data observations
    pyplot.plot(ds[:,0],ds[:,1],'o')
    # plot the centroids
    lines = pyplot.plot(centroids[i,0],centroids[i,1],'kx')
    # make the centroid x's bigger
    pyplot.setp(lines,ms=15.0)
    pyplot.setp(lines,mew=2.0)
pyplot.show()

The shape of my actual DataFrame is (8127, 600).

Answered by jax

I tried this and it works for me: convert the pandas DataFrame to a NumPy array.

df = df.as_matrix(columns= ["a","b","c","d", "e"])
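A note on why the conversion helps, going by the traceback in the question: `df[...]` on a DataFrame is handled as a column lookup (`_getitem_column`), so the result of `np.where(...)` ends up being hashed as a column key, which fails. A NumPy array accepts that same index as a row selection. Also note that `as_matrix` was deprecated in pandas 0.23 and removed in pandas 1.0; on current versions `to_numpy()` is the replacement. The sketch below is only an illustration: the `labels` array is a hard-coded, hypothetical stand-in for `kmeans.labels_` so that it runs without scikit-learn.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [
        [0.57, -0.845, -0.8277, -0.1585, -1.616],
        [0.47, -0.14, -0.5277, -0.158, -1.716],
        [0.17, -0.845, -0.5277, -0.158, -1.616],
        [0.27, -0.14, -0.8277, -0.158, -1.716],
    ],
    columns=["a", "b", "c", "d", "e"],
)

# Assumption: stand-in for kmeans.labels_ (one cluster label per row).
labels = np.array([0, 1, 2, 1])

# Option 1: convert to a NumPy array, then index rows as in the question.
data = df[["a", "b", "c", "d", "e"]].to_numpy()
ds_array = data[np.where(labels == 1)]      # rows whose label is 1
xs, ys = ds_array[:, 0], ds_array[:, 1]     # positional slicing now works

# Option 2: keep the DataFrame and select rows with a boolean mask.
ds_frame = df.loc[labels == 1]
xs_frame = ds_frame.iloc[:, 0]              # first column, by position
```

With either option, the `ds[:,0]` and `ds[:,1]` slices in the plotting loop need an array (Option 1) or `.iloc` (Option 2), since a DataFrame does not accept `ds[:, 0]` directly.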