Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors on StackOverflow.
Original question: http://stackoverflow.com/questions/52736900/
How to solve the Attribute error 'float' object has no attribute 'split' in python?
Asked by School
When I run the code below, it raises AttributeError: 'float' object has no attribute 'split'.
I would like to know why this error comes about.
def text_processing(df):
    """=== Lower case ==="""
    # First step is to transform comments into lower case
    df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
    return df

df = text_processing(df)
The full traceback for the error:
Traceback (most recent call last):
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1664, in <module>
main()
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1658, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\pydevd.py", line 1068, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.2.2\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 53, in <module>
df = text_processing(df)
File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 30, in text_processing
df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
File "C:\Users\L31307\AppData\Roaming\Python\Python37\site-packages\pandas\core\series.py", line 3194, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)
File "pandas/_libs/src\inference.pyx", line 1472, in pandas._libs.lib.map_infer
File "C:/Users/L31307/Documents/FYP P3_Lynn_161015H/FYP 10.10.18 (Wed) still working on it/FYP/dataanalysis/category_analysis.py", line 30, in <lambda>
df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() if x not in stop_words))
AttributeError: 'float' object has no attribute 'split'
Answered by jpp
The error points to this line:
df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in x.split() \
                                    if x not in stop_words))
split is being used here as a method of Python's built-in str class. Your error indicates that one or more values in df['content'] are of type float. This could be because there is a null value, i.e. NaN, or a non-null float value.
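To see which of the two cases applies, it can help to list the non-string values before changing anything. The following is a minimal sketch; the sample DataFrame is made up for illustration and stands in for the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's DataFrame.
df = pd.DataFrame({"content": ["Hello World", np.nan, 3.5, "more text"]})

# Boolean mask of the rows whose value is not a str.
not_str = ~df["content"].map(lambda v: isinstance(v, str))

# Show the offending rows together with their Python types.
offenders = df.loc[not_str, "content"]
print(offenders.map(type))

# Separate missing values (NaN) from real, non-null floats.
n_missing = int(offenders.isna().sum())
n_other = int(len(offenders) - n_missing)
print(n_missing, n_other)
```

If every offender turns out to be NaN, the column simply has missing text; if some are real numbers, the data was probably read with mixed types.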
One workaround, which will stringify floats, is to apply str to x before calling split:
df['content'] = df['content'].apply(lambda x: " ".join(x.lower() for x in str(x).split() \
                                    if x not in stop_words))
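One thing to keep in mind with this workaround is that a missing value is stringified too, so NaN ends up in the column as the text 'nan'. The sketch below shows that effect and one possible variant that leaves non-strings untouched. The stop_words set, the sample data and the clean helper are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

stop_words = {"a", "the"}  # hypothetical stop-word set
df = pd.DataFrame({"content": ["a Quick fox", np.nan]})  # hypothetical data

# Workaround as written: every value goes through str(), including NaN.
stringified = df["content"].apply(
    lambda x: " ".join(w.lower() for w in str(x).split() if w not in stop_words)
)
print(stringified.tolist())  # the NaN row is now the string 'nan'


# Variant: only process real strings and pass everything else through.
def clean(x):
    if not isinstance(x, str):
        return x  # leave NaN (or any other non-string) as it is
    return " ".join(w.lower() for w in x.split() if w not in stop_words)


kept = df["content"].apply(clean)
print(kept.isna().tolist())  # the NaN row is still missing
```

Which behaviour is preferable depends on whether later steps should treat missing comments as empty text or as missing data.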
Alternatively, and possibly a better solution, be explicit and use a named function with a try / except clause:
def converter(x):
    try:
        # call x.split() directly: str(x).split() would never raise AttributeError
        return ' '.join([x.lower() for x in x.split() if x not in stop_words])
    except AttributeError:
        return None  # or some other value

df['content'] = df['content'].apply(converter)
Since pd.Series.apply is just a loop with overhead, you may find a list comprehension or map more efficient:
df['content'] = [converter(x) for x in df['content']]
df['content'] = list(map(converter, df['content']))
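As a rough check that the three forms are interchangeable, the sketch below runs them on a tiny made-up column. The stop_words set and the sample data are assumptions, and this converter calls x.split() directly so that the except branch can actually be reached for floats. It demonstrates equivalence only; any speed difference would need to be measured (for example with timeit) on your own data.

```python
import numpy as np
import pandas as pd

stop_words = {"a", "the"}  # hypothetical stop-word set
df = pd.DataFrame({"content": ["a Quick fox", np.nan]})  # hypothetical data


def converter(x):
    try:
        # x.split() raises AttributeError when x is a float such as NaN
        return " ".join(w.lower() for w in x.split() if w not in stop_words)
    except AttributeError:
        return None  # or some other value


via_apply = df["content"].apply(converter)
via_list = [converter(x) for x in df["content"]]
via_map = list(map(converter, df["content"]))

print(via_list)
print(via_map == via_list)
```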
Answered by Dominique Paul
split() is a Python method that only applies to strings. It seems that your column "content" contains not only strings but also other values, such as floats, to which you cannot apply the .split() method.
Try converting the values to strings with str(x).split(), or convert the entire column to strings first, which would be more efficient. You do this as follows:
df['column_name'] = df['column_name'].astype(str)  # astype returns a new Series, so assign it back
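Put together with the asker's stop-word filtering, a column-wide conversion could look like the sketch below. Filling missing values before the conversion keeps them from being stringified along with the floats. The stop_words set and the sample data are assumptions made for illustration, and lower-casing before the stop-word check is a design choice of this sketch, not something the original code does.

```python
import numpy as np
import pandas as pd

stop_words = {"a", "the"}  # hypothetical stop-word set
df = pd.DataFrame({"content": ["Hello the World", np.nan, 3.5]})  # hypothetical data

# Replace missing values with empty text, then convert what is left to str.
text = df["content"].fillna("").astype(str)

# Every value is now a string, so the .str accessor is safe to use.
tokens = text.str.lower().str.split()

# Drop stop words and join the remaining tokens back together.
df["content"] = [" ".join(w for w in words if w not in stop_words) for words in tokens]

result = df["content"].tolist()
print(result)
```

Note that the non-null float 3.5 survives as the text '3.5'; if such rows are not real comments, it may be better to drop them before this step.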