Python AttributeError: 'float' object has no attribute 'split'

Question

Asked by Dhruv Ghulati

I am calling this line:

lang_modifiers = [keyw.strip() for keyw in row["language_modifiers"].split("|") if not isinstance(row["language_modifiers"], float)]

This seems to work where row["language_modifiers"] is a word (atlas method, central), but not when it comes up as nan.

I thought my if not isinstance(row["language_modifiers"], float) would catch the cases where the value comes up as nan, but that is not the case.

Background: row["language_modifiers"] is a cell in a TSV file, and comes up as nan when that cell was empty in the TSV being parsed.

Answer 1

Answered by Ozgur Ozturk

You are right: such errors are mostly caused by NaN values representing empty cells. It is common to filter out such rows before applying further operations, using this idiom on your dataframe df:

df_new = df[df['ColumnName'].notnull()]
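
Applied to the column from the question, the filter might look like the sketch below. The sample data is hypothetical (it only stands in for the parsed TSV), and the nested list comprehension is one possible way to do the splitting, not necessarily what the asker needs.

```python
import pandas as pd

# Hypothetical data standing in for the parsed TSV; None marks an empty cell,
# which pandas treats as missing (NaN).
df = pd.DataFrame({
    "language_modifiers": ["atlas method | central", None, "regional"],
})

# Keep only the rows whose cell is not missing.
df_new = df[df["language_modifiers"].notnull()]

# Every remaining cell is a string, so .split() is safe here.
lang_modifiers = [
    [keyw.strip() for keyw in cell.split("|")]
    for cell in df_new["language_modifiers"]
]
print(lang_modifiers)
```

Filtering first matters because in the original comprehension the expression row["language_modifiers"].split("|") is evaluated to produce the iterable before the if clause is ever checked, so the isinstance test comes too late to prevent the error.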

Alternatively, it may be handier to use the fillna() method to impute (replace) null values with a default. For example, all null or NaN values can be replaced with the average value of their column:

housing['LotArea'] = housing['LotArea'].fillna(housing['LotArea'].mean())

or they can be replaced with an empty string "" or another default value:

housing['GarageCond'] = housing['GarageCond'].fillna("")
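
For the column in the question, the empty-string approach could be sketched as follows. The sample data is hypothetical, and dropping empty pieces after the split is an added assumption (otherwise "".split("|") would yield a list containing one empty string).

```python
import pandas as pd

# Hypothetical data standing in for the parsed TSV; None marks an empty cell.
df = pd.DataFrame({"language_modifiers": ["atlas method | central", None]})

# Replace missing cells with "" so every cell is a string.
filled = df["language_modifiers"].fillna("")

# Split each cell on "|" and discard empty pieces, so a blank cell
# becomes an empty list instead of [""].
lang_modifiers = [
    [keyw.strip() for keyw in cell.split("|") if keyw.strip()]
    for cell in filled
]
print(lang_modifiers)
```

Unlike filtering with notnull(), this keeps one entry per original row, which may be preferable when the results have to stay aligned with the other columns.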

Answer 2

Answered by hpl002

You might also use df = df.dropna(thresh=n), where n is the threshold: a row must contain at least n non-NA values to be kept.

Mind you, this approach removes the whole row from the dataframe rather than filling in the missing values.

For example: if you have a dataframe with 5 columns, df.dropna(thresh=5) would drop any row that does not have 5 valid (non-NA) values.

In your case you might only want to keep valid rows; if so, you can set the threshold to the number of columns you have.
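
A small sketch of how thresh behaves, using a made-up three-column dataframe (the column names and values are illustrative only):

```python
import numpy as np
import pandas as pd

# Made-up dataframe: row 0 is complete, row 1 has one valid value,
# row 2 has two valid values.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", "y", None],
    "c": [10.0, np.nan, 30.0],
})

# Keep rows that have at least 2 non-NA values.
at_least_two = df.dropna(thresh=2)

# Setting thresh to the number of columns keeps only fully populated rows.
complete = df.dropna(thresh=len(df.columns))

print(list(at_least_two.index))
print(list(complete.index))
```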

pandas documentation on dropna
