pandas TypeError: object of type 'float' has no len() & TypeError: 'float' object is not iterable

Question

Asked by M.Z

I have a dataset imported as a DataFrame "new_data_words". It has a column "page_name" containing messy webpage names, like "%D8%AA%D8%B5%D9%86%D9%8A%D9%81:%D8%A2%D9%84%D9...", "%D9%85%D9%84%D9%81:IT-Airforce-OR2.png" or simply "1950". I want to create a new column 'word_count' holding the number of words in the page name (words are delimited by '_').

Here is my code:

To split into words:

b = list(new_data_words['page_name'].str.split('_'))
new_data_words['words'] = b

I checked that b is a list and that len(b) is 6035980. One sample value:

In [1]: new_data_words.loc[0,'words']
Out[2]: ['%D8%AA%D8%B5%D9%86%D9%8A%D9%81:%D8%A2%D9%84%D9%87%D8%A9',
         '%D8%A8%D9%84%D8%A7%D8%AF',
         '%D8%A7%D9%84%D8%B1%D8%A7%D9%81%D8%AF%D9%8A%D9%86']

I created another column "word_count" to count the elements of the list in each row of the "words" column. (I have to use a loop to reach the list in each row.)

But I got errors:

x = []
i = []
c = 0
for i in b:    # i is a list whose elements are strings (I checked)
    c=c+1
    x.append(len(i))

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-12-c0cf0cfbc458> in <module>()
      6         #y = str(y)
      7     c=c+1
----> 8     x.append(len(i))

TypeError: object of type 'float' has no len()

I don't know why it is a float.

However, if I only add a print, it works:

x = []
i = []
c = 0
for i in b:
    c=c+1
    print(len(i))
    x.append(len(i))

3
2
3
2
3
1
8
...

But c = len(x) = 68516, much smaller than 6 million.

I tried to force the elements to be strings again, and another error occurred:

x = []
for i in b:
    for y in i:
        y = str(y)
    x.append(len(i))


TypeError                                 Traceback (most recent call last)
<ipython-input-164-c86f5f48b80c> in <module>()
      1 x = []
      2 for i in b:
----> 3     for y in i:
      4         y = str(y)
      5     x.append(len(i))
TypeError: 'float' object is not iterable

I thought i was a list and therefore iterable...

Again, if I do not append but only print, it works:

x = []
for i in b:
    for y in i:
        y = str(y)
    print (len(i))

Another example. This works:

a = []
for i in range(10000):
    a.append(len(new_data_words.loc[i,"words"]))

With a dynamic range, it does not work:

a = []
for i in range(len(b)):
    a.append(len(new_data_words.loc[i,"words"]))


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-f9d0af3c448f> in <module>()
      1 a = []
      2 for i in range(len(b)):
----> 3     a.append(len(new_data_words.loc[i,"words"]))

TypeError: object of type 'float' has no len()

This does not work either:

a = []
for i in range(6035980):
    a.append(len(new_data_words.loc[i,"words"]))

It seems there is something abnormal in the list, but I don't know what it is or how to find it.

Can anyone help, please?

Answer 1

Answered by ShadowRanger

You're wrong. The errors you're seeing make it 100% clear that b is an iterable containing at least one float (whether the other elements are str or not, I won't speculate).

Try doing:

for i in b:
    print(type(i), i)

and you'll see there is at least one float. Or use this to print only the non-iterable components of b:

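from collections.abc import Iterable  # collections.Iterable was removed in Python 3.10

for i in b:
    # Lists and strings are iterable; a float such as NaN is not
    if not isinstance(i, Iterable):
        print(type(i), i)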
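
A likely source of those floats, assuming some rows of "page_name" are missing: pandas' .str.split() leaves a missing value as NaN, and NaN is a float. The sketch below uses a small made-up DataFrame in place of the real new_data_words to show how the offending rows could be located and how the count could be computed without a loop. Treating a missing page name as 0 words is an assumption for illustration, not something the question specifies.

```python
import pandas as pd

# Hypothetical stand-in for new_data_words; NaN marks a missing page name.
new_data_words = pd.DataFrame(
    {"page_name": ["a_b_c", "1950", float("nan"), "x_y"]}
)

# Splitting keeps missing values as NaN instead of producing a list.
words = new_data_words["page_name"].str.split("_")

# Rows where the split result is missing: these are the "float" elements.
bad_rows = new_data_words[words.isna()]
print(bad_rows)

# Series.str.len() returns the length of each list and NaN for missing rows.
# Assumption: a missing page name counts as 0 words.
new_data_words["word_count"] = words.str.len().fillna(0).astype(int)
print(new_data_words["word_count"].tolist())  # [3, 1, 0, 2]
```

If missing rows should instead be dropped or kept as NaN, the fillna(0) step can be replaced accordingly.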
