pandas Python TypeError: 'numpy.int32' object is not iterable

Question

Asked by bananablue1

I am trying to compute the entropy of my k-means result dataframe, but I get the error TypeError: 'numpy.int32' object is not iterable, and I don't understand why.

from collections import Counter 
def calcEntropy(x):
    p, lens = Counter(x), np.float(len(x))
    return -np.sum(count/lens*np.log2(count/lens) for count in p.values())
k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']]

I then get the following error message:

<ipython-input-26-d375ecf00330> in <module>()
----> 1 k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']]

<ipython-input-26-d375ecf00330> in <listcomp>(.0)
----> 1 k_means_sp['entropy']=[calcEntropy(x) for x in k_means_sp['cluster']]

<ipython-input-23-f5508ea8782c> in calcEntropy(x)
      1 from collections import Counter
      2 def calcEntropy(x):
----> 3     p, lens = Counter(x), np.float(len(x))
      4     return -np.sum(count/lens*np.log2(count/lens) for count in p.values())

/Users/mpiercy/anaconda/lib/python3.6/collections/__init__.py in __init__(*args, **kwds)
    535             raise TypeError('expected at most 1 arguments, got %d' % len(args))
    536         super(Counter, self).__init__()
--> 537         self.update(*args, **kwds)
    538 
    539     def __missing__(self, key):

/Users/mpiercy/anaconda/lib/python3.6/collections/__init__.py in update(*args, **kwds)
    622                     super(Counter, self).update(iterable) # fast path when counter is empty
    623             else:
--> 624                 _count_elements(self, iterable)
    625         if kwds:
    626             self.update(kwds)

TypeError: 'numpy.int32' object is not iterable

k_means_sp.head()

     credit     debit  cluster
0  9.207673  8.198884        1
1  4.248495  8.202181        0
2  8.149668  7.735145        2
3  5.138677  7.859741        0
4  8.058163  7.918614        2

Answer 1

Accepted answer by Robbie Jones

OK, this is a first attempt. It looks like your dataframe stores the cluster index in the 'cluster' column. So what you need to do is select each cluster's rows based on that index, and then pass that cluster to your calcEntropy function, something like

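for i in range(len(k_means_sp['cluster'].unique())):  # loop through cluster indices (assumes labels 0..n-1)
    # .loc replaces the deprecated .ix indexer; keep only the rows of cluster i
    cluster = k_means_sp.loc[k_means_sp['cluster'] == i, ['credit', 'debit']]
    entropy = calcEntropy(cluster)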

The second line keeps only the rows whose cluster index equals i. Does this help?
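To make the cause of the error concrete, here is a minimal sketch. It assumes the goal is the entropy of the cluster-label distribution, which the question does not state outright. The dataframe is a hypothetical stand-in built from the five rows shown by k_means_sp.head(), and calc_entropy is an illustrative rewrite of calcEntropy rather than the asker's exact function.

```python
from collections import Counter

import numpy as np
import pandas as pd


def calc_entropy(values):
    """Shannon entropy (in bits) of the labels in an iterable."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    probs = counts / counts.sum()
    return float(-np.sum(probs * np.log2(probs)))


# Hypothetical stand-in for k_means_sp: the five rows shown in the question.
k_means_sp = pd.DataFrame({
    "credit": [9.207673, 4.248495, 8.149668, 5.138677, 8.058163],
    "debit": [8.198884, 8.202181, 7.735145, 7.859741, 7.918614],
    "cluster": [1, 0, 2, 0, 2],
})

# Iterating over a column yields one scalar label per row (numpy.int32 in the
# question; it may be a plain int on other pandas versions). Counter needs an
# iterable, so passing a single scalar raises the TypeError.
first_label = next(iter(k_means_sp["cluster"]))
try:
    Counter(first_label)
except TypeError as err:
    print("reproduced:", err)

# Passing the whole column gives Counter something it can iterate over.
label_entropy = calc_entropy(k_means_sp["cluster"])
print(label_entropy)
```

Note that Counter iterates over a DataFrame by column label, not by value, so if the values themselves should be counted, pass a single column (a Series) rather than a multi-column slice.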
