pandas TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed
Original question: http://stackoverflow.com/questions/43239097/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed
Asked by 0AJ0
My Code:
import pandas as pd
from sklearn.cluster import KMeans

samples = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data", sep=',', header=None)
varieties = pd.DataFrame(samples.iloc[:, 0])
kmeans = KMeans(n_clusters=3)
labels = kmeans.fit_predict(samples)
# shift the cluster labels from 0-2 to 1-3 to match the numbering in the data
labels += 1
# convert 'labels' to a pandas DataFrame
labels = pd.DataFrame(labels)
df = pd.DataFrame({'labels': [labels], 'varieties': [varieties]})
ct = pd.crosstab(df['labels'], df['varieties'])
I want to use these DataFrames (labels and varieties) with the crosstab function. How can I do that?
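For context, the error in the title can be reproduced in isolation: pandas deliberately refuses to hash a DataFrame. The sketch below uses hypothetical toy data (not the wine dataset) and a hypothetical helper, `try_hash`, to show the message. It probably applies here because wrapping a DataFrame in a list, as in `[labels]`, puts the whole object into a single cell, and crosstab has to hash the cell values in order to group them.

```python
import pandas as pd

# Hypothetical toy data, not the wine dataset from the question.
toy = pd.DataFrame({"labels": [1, 2, 3]})


def try_hash(obj):
    """Return hash(obj), or the TypeError message if obj is unhashable."""
    try:
        return hash(obj)
    except TypeError as exc:
        return str(exc)


# A DataFrame is mutable, so pandas raises TypeError when it is hashed.
message = try_hash(toy)
print(message)

# A single scalar taken out of the DataFrame hashes without error.
scalar_hash = try_hash(toy["labels"].iloc[0])
print(type(scalar_hash).__name__)
```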
Accepted answer by Geoff Perrin
Why are you storing the labels in a separate DataFrame? It might be easier to save them as a new column in the varieties DataFrame and then run crosstab between those two columns.
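import pandas as pd
from sklearn.cluster import KMeans

samples = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data", sep=',', header=None)
# column 0 holds the variety (class) of each sample
varieties = pd.DataFrame(samples.iloc[:, 0])
kmeans = KMeans(n_clusters=3)
# store the cluster labels as a new column next to the varieties
varieties['labels'] = kmeans.fit_predict(samples)
# shift the cluster labels from 0-2 to 1-3 to match the numbering in the data
varieties['labels'] += 1
pd.crosstab(varieties.iloc[:, 0], varieties['labels'])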
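As a variation on the same idea, crosstab also accepts plain one-dimensional inputs, so the labels do not have to be wrapped in a DataFrame at all. The following is a minimal sketch with made-up values standing in for the wine varieties and for the array that `fit_predict` returns; the numbers are assumptions for illustration, not results from the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins: known varieties and cluster labels for six samples.
varieties = pd.Series([1, 1, 2, 2, 3, 3], name="varieties")
labels = np.array([1, 1, 2, 3, 3, 3])  # same shape as a fit_predict result

# Pass the 1-D inputs directly; rownames labels the unnamed NumPy array.
ct = pd.crosstab(labels, varieties, rownames=["labels"])
print(ct)
```

Each cell of `ct` counts how many samples share a given cluster label (row) and variety (column), which is the comparison the question was after.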