Remove rows where column value type is string (Pandas)
Original question: http://stackoverflow.com/questions/26771471/
Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original address and authors, and attribute it to the original authors (not me): Stack Overflow
Asked by porteclefs
I have a pandas dataframe. One of my columns should only be floats. When I try to convert that column to floats, I'm alerted that there are strings in there. I'd like to delete all rows where values in this column are strings...
Answered by EdChum
Use convert_objects with the parameter convert_numeric=True; this coerces any non-numeric values to NaN:
In [24]:
import pandas as pd
df = pd.DataFrame({'a': [0.1, 0.5, 'jasdh', 9.0]})
df
Out[24]:
a
0 0.1
1 0.5
2 jasdh
3 9
In [27]:
df.convert_objects(convert_numeric=True)
Out[27]:
a
0 0.1
1 0.5
2 NaN
3 9.0
You can then drop them:
In [29]:
df.convert_objects(convert_numeric=True).dropna()
Out[29]:
a
0 0.1
1 0.5
3 9.0
UPDATE
Since version 0.17.0 this method is deprecated and you need to use to_numeric instead. Unfortunately, to_numeric operates on a Series rather than a whole DataFrame, so the equivalent code is now:
df.apply(lambda x: pd.to_numeric(x, errors='coerce')).dropna()
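Applying to_numeric to every column also coerces unrelated text columns to NaN, so dropna can then remove more rows than intended. A minimal sketch that restricts the conversion to one column is shown here, assuming the column is named 'a' as in the example; the helper drop_non_numeric_rows and the extra column 'b' are hypothetical and only for illustration:

```python
import pandas as pd


def drop_non_numeric_rows(frame, column):
    """Return a copy of frame without rows whose value in column is not numeric.

    Hypothetical helper for illustration; not part of pandas.
    """
    # errors='coerce' turns anything that cannot be parsed as a number into NaN
    converted = pd.to_numeric(frame[column], errors='coerce')
    # Keep only the rows that converted cleanly, then store the numeric values
    result = frame[converted.notna()].copy()
    result[column] = converted[converted.notna()]
    return result


df = pd.DataFrame({'a': [0.1, 0.5, 'jasdh', 9.0], 'b': ['w', 'x', 'y', 'z']})
cleaned = drop_non_numeric_rows(df, 'a')
print(cleaned)
print(cleaned.dtypes)
```

Note that values that were already missing are dropped as well, and numeric-looking strings such as '3.5' are converted rather than removed.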
Answered by jpp
One of my columns should only be floats. I'd like to delete all rows where values in this column are strings
You can convert your series to numeric via pd.to_numeric and then use pd.Series.notnull to build a row filter. Conversion to float is required as a separate step, because the filtered column otherwise keeps its object dtype.
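import pandas as pd

# Data from @EdChum
df = pd.DataFrame({'a': [0.1, 0.5, 'jasdh', 9.0]})
# Keep rows whose value in 'a' parses as a number; copy to avoid SettingWithCopyWarning
res = df[pd.to_numeric(df['a'], errors='coerce').notnull()].copy()
# The surviving values are still object dtype, so convert them explicitly
res['a'] = res['a'].astype(float)
print(res)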
a
0 0.1
1 0.5
3 9.0
Answered by Karthik V
You can find the data type of a column from the dtype.kind attribute, for example df[col].dtype.kind. See the numpy docs for more details. Transpose the dataframe to go from indices to columns.
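As a rough illustration of dtype.kind, the sketch below inspects each column and then falls back to a per-value isinstance check, since a column that mixes floats and strings is stored as object. The isinstance filter is an assumption added here to connect the idea to the question, not something this answer spells out, and the second column 'b' is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({'a': [0.1, 0.5, 'jasdh', 9.0], 'b': [1.0, 2.0, 3.0, 4.0]})

# dtype.kind is a one-character code, e.g. 'O' for object and 'f' for float
kinds = {col: df[col].dtype.kind for col in df.columns}
print(kinds)

# An object column can hold mixed values, so check each value individually
is_string = df['a'].map(lambda value: isinstance(value, str))
cleaned = df[~is_string]
print(cleaned)
```

dtype.kind only says that column 'a' is not purely numeric; locating the offending rows still requires looking at the individual values.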
Answered by geomars
Assume your data frame is df and you want to ensure that all data in one of its columns is numeric with a specific pandas dtype, e.g. float:
df[df.columns[n]] = pd.to_numeric(df[df.columns[n]], errors='coerce').fillna(0).astype(float)
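Note that filling with 0 keeps every row and substitutes 0 for values that could not be parsed, which is a different outcome from deleting those rows as the question asks. A small sketch contrasting the two options, assuming a column named 'a':

```python
import pandas as pd

df = pd.DataFrame({'a': [0.1, 0.5, 'jasdh', 9.0]})
numeric = pd.to_numeric(df['a'], errors='coerce')

# Option 1: keep every row, substituting 0.0 for values that failed to parse
filled = df.assign(a=numeric.fillna(0).astype(float))

# Option 2: drop the rows whose value failed to parse
dropped = df.assign(a=numeric).dropna(subset=['a'])

print(filled)
print(dropped)
```

Which option fits depends on whether a placeholder value of 0 is acceptable in later calculations.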

