pandas: better way to convert DataFrame columns to numeric

Question

Asked by Sveinn

I have a dataframe in which some columns have dtype object because of some funky data entries (e.g. a stray . or whatnot).

I have been able to correct this by identifying the object columns and then doing this:

obj_cols = df.loc[:, df.dtypes == object]
conv_cols = obj_cols.convert_objects(convert_numeric='force')

This works fine and allows me to run the regression I need, but generates this warning:

FutureWarning: convert_objects is deprecated.

Is there a better way to do this so as to avoid the warning? I also tried constructing a lambda function but that didn't work.

Answer 1

Answered by Vaishali

convert_objects is deprecated. Use pd.to_numeric instead, applied to each column with DataFrame.apply. Adding errors='coerce' converts values that cannot be parsed as numbers to NaN.

conv_cols = obj_cols.apply(pd.to_numeric, errors='coerce')

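apply calls the function on every column of obj_cols. With errors='coerce', values that parse as numbers are converted and anything else becomes NaN, so a column that holds only non-numeric strings (or date strings) comes back entirely NaN rather than being left alone. To leave such columns untouched, errors='ignore' returns the input unchanged when conversion fails, although recent pandas releases deprecate that option.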
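To make the coerce behaviour concrete, here is a minimal sketch on a made-up frame. The column names a, b, c and the names convertible and result are assumptions for illustration, not part of the question. It shows an all-text column turning into NaN, and one possible way to keep such columns as they were: only write back the columns where at least one value actually converted.

```python
import pandas as pd

# Hypothetical data: 'a' is clean, 'b' has a stray '.', 'c' is pure text.
# dtype=object forces object columns, as in the question.
df = pd.DataFrame(
    {"a": ["1", "2", "3"], "b": ["4.5", ".", "6"], "c": ["x", "y", "z"]},
    dtype=object,
)

# Same selection as in the question, then the to_numeric replacement.
obj_cols = df.loc[:, df.dtypes == object]
conv = obj_cols.apply(pd.to_numeric, errors="coerce")
print(conv)  # 'c' is all NaN, 'b' has one NaN

# Assumed helper logic: keep a converted column only if something converted.
convertible = [col for col in conv.columns if conv[col].notna().any()]

result = df.copy()
for col in convertible:
    result[col] = conv[col]  # numeric columns replace the object ones

print(result.dtypes)  # 'c' keeps its original text values
```

Note that this rule also skips a column that was entirely missing to begin with, so it is a heuristic rather than a drop-in replacement for the old convert_objects behaviour.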

Answer 2

Answered by MissBleu

If you have a sample data frame:

import pandas as pd

# Each month column mixes numbers with one stray string.
sales = [{'account': 'Jones LLC', 'Jan': 150, 'Feb': 'f', 'Mar': 140},
         {'account': 'Alpha Co',  'Jan': 'e', 'Feb': 210, 'Mar': 215},
         {'account': 'Blue Inc',  'Jan': 50,  'Feb': 90,  'Mar': 'g'}]
df = pd.DataFrame(sales)

and you want to get rid of the strings in the columns that should be numeric, you can do this with pd.to_numeric:

cols = ['Jan', 'Feb', 'Mar']
# Convert column by column (the default axis); unparsable entries become NaN.
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')

Your new data frame will have NaN in place of the 'wacky' data.
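Since coercion silently discards whatever it cannot parse, it may be worth checking which entries were turned into NaN before running a regression on the result. The sketch below reuses the sales example. The names raw, converted, and coerced are assumptions introduced here, and this is one possible check rather than anything the answer prescribes.

```python
import pandas as pd

sales = [{'account': 'Jones LLC', 'Jan': 150, 'Feb': 'f', 'Mar': 140},
         {'account': 'Alpha Co',  'Jan': 'e', 'Feb': 210, 'Mar': 215},
         {'account': 'Blue Inc',  'Jan': 50,  'Feb': 90,  'Mar': 'g'}]
raw = pd.DataFrame(sales)

cols = ['Jan', 'Feb', 'Mar']
converted = raw[cols].apply(pd.to_numeric, errors='coerce')

# True where a value was present before conversion but is NaN afterwards.
coerced = converted.isna() & raw[cols].notna()

# Number of coerced entries per column.
print(coerced.sum())

# The original values that were lost, column by column.
for col in cols:
    print(col, raw.loc[coerced[col], col].tolist())
```

Keeping the unconverted frame around makes it possible to inspect or hand-correct the offending entries instead of dropping them.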
