pandas: How to solve the AttributeError 'float' object has no attribute 'split' in Python?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/52736900/

Time: 2020-09-14 06:04:14  Source: igfitidea

Tags: python, string, pandas, series, attributeerror

Asked by School

When I run the code below, it fails with AttributeError: 'float' object has no attribute 'split'.

I would like to know why this error comes about.

def text_processing(df):

    """First step: transform comments into lower case (stop words are dropped as well)."""
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))

    return df

df = text_processing(df)

The full traceback for the error:

Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1664, in <module>
    main()
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 53, in <module>
    df = text_processing(df)
  File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 30, in text_processing
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
  File "C:\Users\L31307\AppData\Roaming\Python\Python37\site-packages\pandas\core\series.py", line 3194, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/_libs/src\inference.pyx", line 1472, in pandas._libs.lib.map_infer
  File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 30, in <lambda>
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
AttributeError: 'float' object has no attribute 'split'

Answered by jpp

The error points to this line:

df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() \
                                    if x not in stop_words))

split is being used here as a method of Python's built-in str class. Your error indicates that one or more values in df['content'] are of type float. This could be because there is a null value, i.e. NaN (which pandas stores as a float), or a non-null float value.
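To confirm which values are causing the problem, you can inspect the Python types stored in the column. This is only a sketch: the sample DataFrame below is made up for illustration and stands in for your real data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one missing value (NaN) and one numeric value
df = pd.DataFrame({"content": ["Hello World", np.nan, 3.5, "this is text"]})

# Count how many values of each Python type the column holds
type_counts = df["content"].map(type).value_counts()
print(type_counts)

# Select the rows whose value is not a str; these are the rows that break .split()
non_strings = df[~df["content"].map(lambda value: isinstance(value, str))]
print(non_strings)
```

Both NaN and 3.5 are reported as float here, which matches the two cases described above.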

One workaround, which will stringify floats, is to apply str to x before using split:

df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in str(x).split() \
                                    if x not in stop_words))
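One thing to be aware of with this workaround: str() turns a missing value into the literal text "nan", which then survives as a word in the output. The sketch below illustrates this; the stop_words set and the sample data are assumptions made up for the example.

```python
import numpy as np
import pandas as pd

stop_words = {"the", "is"}  # hypothetical stop-word set for illustration
df = pd.DataFrame({"content": ["the sky is blue", np.nan, 2.5]})

# str(text) never raises, but NaN becomes the string "nan" and 2.5 becomes "2.5"
cleaned = df["content"].apply(
    lambda text: " ".join(w.lower() for w in str(text).split() if w not in stop_words)
)
print(cleaned.tolist())  # ['sky blue', 'nan', '2.5']
```

If "nan" showing up as text is a problem for your analysis, handle missing values separately before or instead of stringifying.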

Alternatively, and possibly a better solution, be explicit and use a named function with a try / except clause:

def converter(x):
    try:
        # Call split on x itself: a float (including NaN) raises AttributeError here
        return ' '.join([word.lower() for word in x.split() if word not in stop_words])
    except AttributeError:
        return None  # or some other value

df['content'] = df['content'].apply(converter)
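As a rough illustration of the try / except approach, here is a self-contained variant that calls split on the value itself (without str), so non-string values actually reach the except branch. The function name to_clean_text, the stop_words set, and the sample data are all hypothetical.

```python
import numpy as np
import pandas as pd

stop_words = {"the", "is"}  # hypothetical stop-word set for illustration

def to_clean_text(value):
    """Return the cleaned text, or None when value has no .split() method."""
    try:
        words = value.split()
    except AttributeError:
        # Floats (including NaN) end up here
        return None
    return " ".join(w.lower() for w in words if w not in stop_words)

df = pd.DataFrame({"content": ["the sky is blue", np.nan, 2.5]})
df["content"] = df["content"].apply(to_clean_text)

# Rows that could not be processed are now missing values
print(df["content"].isna().tolist())  # [False, True, True]
```

Unlike the str() workaround, this keeps unprocessable rows visibly missing, so you can drop or fill them later.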

Since pd.Series.apply is just a loop with overhead, you may find a list comprehension or map more efficient:

df['content'] = [converter(x) for x in df['content']]
df['content'] = list(map(converter, df['content']))
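Whether the list comprehension or map is actually faster depends on your data and pandas version, so it is worth measuring. The sketch below uses timeit on made-up data; the converter, stop_words, and the sample Series are assumptions for illustration, and no particular timing result is implied.

```python
import timeit

import numpy as np
import pandas as pd

stop_words = {"the", "is"}  # hypothetical stop-word set for illustration

def converter(value):
    try:
        return " ".join(w.lower() for w in value.split() if w not in stop_words)
    except AttributeError:
        return None

# Hypothetical data: 3000 rows, one third of them missing
content = pd.Series(["the sky is blue", np.nan, "pandas is fun"] * 1000)

# The list comprehension and map produce identical results
via_listcomp = [converter(v) for v in content]
via_map = list(map(converter, content))

# Time each variant a few times; compare the numbers on your own machine
for label, func in [
    ("apply", lambda: content.apply(converter)),
    ("list comprehension", lambda: [converter(v) for v in content]),
    ("map", lambda: list(map(converter, content))),
]:
    seconds = timeit.timeit(func, number=5)
    print(f"{label}: {seconds:.4f} s for 5 runs")
```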

Answered by Dominique Paul

split() is a Python string method, so it only works on strings. It seems that your column "content" contains not only strings but also other values, such as floats, to which you cannot apply the .split() method.

Try converting the values to a string by using str(x).split() or by converting the entire column to strings first, which would be more efficient. You do this as follows:

df['column_name'] = df['column_name'].astype(str)  # astype returns a new Series, so assign it back
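A possible refinement of the whole-column conversion, sketched under the assumption that missing values should become empty text rather than the word "nan": fill them with an empty string before converting. The stop_words set and sample data are made up for illustration.

```python
import numpy as np
import pandas as pd

stop_words = {"the", "is"}  # hypothetical stop-word set for illustration
df = pd.DataFrame({"content": ["the sky is blue", np.nan, 2.5]})

# Fill missing values first so NaN does not turn into the text "nan",
# then convert every remaining value to str in one step
df["content"] = df["content"].fillna("").astype(str)

# Every value is now a str, so .split() is safe
df["content"] = df["content"].apply(
    lambda text: " ".join(w.lower() for w in text.split() if w not in stop_words)
)
print(df["content"].tolist())  # ['sky blue', '', '2.5']
```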