pandas: How to fix AttributeError: 'float' object has no attribute 'split' in Python?

Question

Asked by School

When I run the code below, Python raises AttributeError: 'float' object has no attribute 'split'.

I would like to know why this error comes about.

def text_processing(df):

    """First step is to transform comments into lower case."""
    # === Lower case ===
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))

    return df

df = text_processing(df)

The full traceback for the error:

Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1664, in <module>
    main()
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 53, in <module>
    df = text_processing(df)
  File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 30, in text_processing
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
  File "C:\Users\L31307\AppData\Roaming\Python\Python37\site-packages\pandas\core\series.py", line 3194, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/_libs/src\inference.pyx", line 1472, in pandas._libs.lib.map_infer
  File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 30, in <lambda>
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
AttributeError: 'float' object has no attribute 'split'

Answer 1

Answered by jpp

The error points to this line:

df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() \
                                    if x not in stop_words))

split is being used here as a method of Python's built-in str class. Your error indicates that one or more values in df['content'] are of type float. This could be because there is a null value, i.e. NaN (which pandas stores as a float), or a non-null float value.
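To confirm which values are causing the problem, it can help to inspect the Python types stored in the column. The following is a minimal sketch; the sample DataFrame is hypothetical and only stands in for the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's DataFrame.
df = pd.DataFrame({"content": ["Hello World", np.nan, 3.5, "More text"]})

# Count values by Python type to see what the column really holds.
type_counts = df["content"].map(type).value_counts()
print(type_counts)

# Boolean mask marking the rows whose value is not a str.
not_str = ~df["content"].map(lambda v: isinstance(v, str))
print(df[not_str])
```

Any row selected by the mask would raise the AttributeError when split is called on it.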

One workaround, which will stringify floats, is to apply str to x before calling split. Note that a NaN value then becomes the string 'nan':

df['content'] = df['content'].apply(
    lambda x: " ".join(word.lower() for word in str(x).split() if word not in stop_words)
)
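One caveat with calling str on every value is that a missing value ends up as the literal text "nan". The sketch below compares that with filling missing values with an empty string first. The stop_words set and the sample data are illustrative assumptions, not taken from the question.

```python
import numpy as np
import pandas as pd

stop_words = {"the", "a", "is"}  # hypothetical stop-word set
df = pd.DataFrame({"content": ["The cat is here", np.nan]})

def clean(x):
    # Same logic as the workaround: filter stop words, then lower-case.
    return " ".join(word.lower() for word in str(x).split() if word not in stop_words)

# str() alone turns the missing value into the text "nan".
naive = df["content"].apply(clean)

# Filling missing values first leaves an empty string instead.
filled = df["content"].fillna("").apply(clean)

print(naive.tolist())
print(filled.tolist())
```

Whether an empty string or some other placeholder is the right replacement depends on what the later analysis expects.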

Alternatively, and possibly a better solution, be explicit and use a named function with a try/except clause:

def converter(x):
    try:
        # Call split directly on x so that non-string values raise AttributeError
        return ' '.join([word.lower() for word in x.split() if word not in stop_words])
    except AttributeError:
        return None  # or some other value

df['content'] = df['content'].apply(converter)
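For the except branch to be reached, split has to be called on the raw value rather than on str(x), since str(x) never raises AttributeError. Here is a small self-contained sketch of that variant. The stop_words set and sample rows are hypothetical.

```python
import numpy as np
import pandas as pd

stop_words = {"the", "is"}  # hypothetical stop-word set

def converter(x):
    try:
        # Non-string values (NaN, other floats) have no split method.
        return " ".join(word.lower() for word in x.split() if word not in stop_words)
    except AttributeError:
        return None  # placeholder for values that are not strings

df = pd.DataFrame({"content": ["The sky is blue", np.nan, 2.5]})
result = [converter(x) for x in df["content"]]
print(result)
```

Returning None keeps the non-string rows distinguishable from genuinely empty text, which the str(x) workaround does not.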

Since pd.Series.apply is just a loop with overhead, you may find a list comprehension or map more efficient:

df['content'] = [converter(x) for x in df['content']]
df['content'] = list(map(converter, df['content']))

Answer 2

Answered by Dominique Paul

split() is a Python method that only applies to strings. It seems that your column "content" contains not only strings but also other values, such as floats, to which you cannot apply the .split() method.

Try converting the values to a string by using str(x).split() or by converting the entire column to strings first, which would be more efficient. You do this as follows:

# astype returns a new Series, so assign the result back to the column
df['column_name'] = df['column_name'].astype(str)
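A short sketch of the whole-column conversion follows, using hypothetical sample data. How astype(str) renders missing values may differ between pandas versions, so this example records the missing rows before converting and skips them afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with a missing value and a stray float.
df = pd.DataFrame({"content": ["Hello World", np.nan, 3.5]})

# Remember which rows were missing before any conversion.
missing = df["content"].isna()

# Convert the whole column once and assign the result back.
df["content"] = df["content"].astype(str)

# The rows that were not missing now hold str values, so split() is safe on them.
tokens = df.loc[~missing, "content"].apply(lambda s: s.split())
print(tokens.tolist())
```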
