Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/26771471/

Date: 2020-09-13 22:38:07  Source: igfitidea

Remove rows where column value type is string Pandas

python pandas dataframe

Asked by porteclefs

I have a pandas dataframe. One of my columns should only be floats. When I try to convert that column to floats, I'm alerted that there are strings in there. I'd like to delete all rows where values in this column are strings...

Answered by EdChum

Use convert_objects with the parameter convert_numeric=True; this coerces any non-numeric values to NaN:

In [24]:

df = pd.DataFrame({'a': [0.1,0.5,'jasdh', 9.0]})
df
Out[24]:
       a
0    0.1
1    0.5
2  jasdh
3      9
In [27]:

df.convert_objects(convert_numeric=True)
Out[27]:
     a
0  0.1
1  0.5
2  NaN
3  9.0

You can then drop them:

In [29]:

df.convert_objects(convert_numeric=True).dropna()
Out[29]:
     a
0  0.1
1  0.5
3  9.0

UPDATE

Since version 0.17.0 this method is deprecated and you need to use to_numeric instead. Unfortunately, to_numeric operates on a Series rather than a whole df, so the equivalent code is now:

df.apply(lambda x: pd.to_numeric(x, errors='coerce')).dropna()
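
The question concerns a single column, and applying to_numeric to every column would also turn a legitimate text column into NaN. The following is a minimal sketch, not part of the original answer, that coerces only the target column and drops rows based on that column alone. The extra 'name' column is a hypothetical addition for illustration.

```python
import pandas as pd

# Hypothetical frame: 'a' should be numeric, 'name' is legitimately text
df = pd.DataFrame({
    'a': [0.1, 0.5, 'jasdh', 9.0],
    'name': ['w', 'x', 'y', 'z'],
})

# Coerce only column 'a'; anything that cannot be parsed becomes NaN
converted = df.assign(a=pd.to_numeric(df['a'], errors='coerce'))

# Drop rows where 'a' is NaN, leaving the other columns untouched
cleaned = converted.dropna(subset=['a'])

print(cleaned)
print(cleaned['a'].dtype)
```

Note that dropna(subset=['a']) also removes rows where 'a' was already missing in the input, which may or may not be what you want.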

Answered by jpp

One of my columns should only be floats. I'd like to delete all rows where values in this column are strings

You can convert your series to numeric via pd.to_numeric and then use pd.Series.notnull to select the rows that converted successfully. Conversion to float is required as a separate step because the filtered column otherwise keeps its object dtype.

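import pandas as pd

# Data from @EdChum
df = pd.DataFrame({'a': [0.1, 0.5, 'jasdh', 9.0]})

# Keep only the rows whose value in 'a' parses as a number;
# .copy() avoids a SettingWithCopyWarning on the assignment below
res = df[pd.to_numeric(df['a'], errors='coerce').notnull()].copy()
# The filtered column still has object dtype, so convert it explicitly
res['a'] = res['a'].astype(float)

print(res)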

     a
0  0.1
1  0.5
3  9.0

Answered by Karthik V

You can find the data type of a column from the dtype.kind attribute, with something like df[col].dtype.kind. See the numpy docs for more details. Transpose the dataframe to go from indices to columns.
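
As a rough illustration of dtype.kind, the sketch below reuses the sample data from the other answers. Since dtype.kind describes the column as a whole, the per-row isinstance check is an added assumption about how one might locate the string rows; it is not something the answer itself spells out.

```python
import pandas as pd

df = pd.DataFrame({'a': [0.1, 0.5, 'jasdh', 9.0]})

# Mixed floats and strings are stored as a generic object column
print(df['a'].dtype.kind)  # 'O'

# Per-row check: which values are Python strings?
is_str = df['a'].map(lambda v: isinstance(v, str))

# Keep the non-string rows, then convert the column to float
res = df[~is_str].copy()
res['a'] = res['a'].astype(float)
print(res['a'].dtype.kind)  # 'f'
```

Unlike coercion with pd.to_numeric, this check also removes strings that look numeric, such as '3.5', so the choice depends on whether such values should be kept.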

Answered by geomars

Assume your data frame is df and you want to ensure that all data in one of its columns is numeric with a specific pandas dtype, e.g. float:

df[df.columns[n]] = df[df.columns[n]].apply(pd.to_numeric, errors='coerce').fillna(0).astype(float).dropna()
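
Because fillna(0) runs before dropna() in the line above, no NaN is left to drop: non-numeric values end up as 0.0 and every row is kept. The sketch below contrasts that behaviour with a variant that removes the rows instead. The value n = 0 and the sample data are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [0.1, 0.5, 'jasdh', 9.0]})
n = 0  # assumed position of the column to clean

col = df.columns[n]
# Unparseable values become NaN; the result has float64 dtype
numeric = pd.to_numeric(df[col], errors='coerce')

# Variant 1: keep every row, replacing unparseable values with 0
filled = df.assign(**{col: numeric.fillna(0).astype(float)})

# Variant 2: remove the rows whose value could not be parsed
dropped = df.assign(**{col: numeric}).dropna(subset=[col])

print(filled)
print(dropped)
```

Variant 2 matches what the question asks for; variant 1 is only appropriate when 0 is an acceptable stand-in for bad values.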