Rounding down values in a Pandas dataframe column with NaNs

Question

Asked by user1718097

I have a Pandas dataframe that contains a column of float64 values:

import numpy as np
import pandas as pd

tempDF = pd.DataFrame({ 'id': [12,12,12,12,45,45,45,51,51,51,51,51,51,76,76,76,91,91,91,91],
                        'measure': [3.2,4.2,6.8,5.6,3.1,4.8,8.8,3.0,1.9,2.1,2.4,3.5,4.2,5.2,4.3,3.6,5.2,7.1,6.5,7.3]})

I want to create a new column containing just the integer part. My first thought was to use .astype(int):

tempDF['int_measure'] = tempDF['measure'].astype(int)

This works fine but, as an extra complication, the column I have contains a missing value:

tempDF.loc[10, 'measure'] = np.nan  # .ix was removed in pandas 1.0; .loc does the same label-based assignment

This missing value causes the .astype(int) method to fail with:

ValueError: Cannot convert NA to integer

I thought I could round down the floats in the column of data. However, the .round(0) function will round to the nearest integer (higher or lower) rather than rounding down. I can't find a function equivalent to ".floor()" that will act on a column of a Pandas dataframe.

Any suggestions?

Answer 1

Answered by Joachim Isaksson

You could just apply numpy.floor:

import numpy as np

tempDF['int_measure'] = tempDF['measure'].apply(np.floor)

    id  measure  int_measure
0   12      3.2            3
1   12      4.2            4
2   12      6.8            6
...
9   51      2.1            2
10  51      NaN          NaN
11  51      3.5            3
...
19  91      7.3            7
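
As a variation on the approach above, np.floor can also be called on the whole column rather than through .apply, and the floored floats can then be cast to pandas' nullable "Int64" dtype if true integers are wanted. This is only a minimal sketch: the small frame `df` and the column name `floor_measure` are made up for illustration, and the nullable dtype assumes pandas 0.24 or later.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's data, with one missing value.
df = pd.DataFrame({"id": [12, 12, 51, 51],
                   "measure": [3.2, 6.8, np.nan, 3.5]})

# np.floor is a ufunc, so it can be called on the whole Series at once;
# the NaN is passed through unchanged and the result stays float64.
df["floor_measure"] = np.floor(df["measure"])

# Optional: cast to the nullable integer dtype so the values are real
# integers and the missing entry is shown as <NA>.
df["int_measure"] = df["floor_measure"].astype("Int64")

print(df)
```

The float column keeps NaN as its missing marker, while the "Int64" column stores integers alongside a missing-value mask.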

Answer 2

Answered by Alexander

You could also try:

df.apply(lambda s: s // 1)

Using np.floor is faster, however.
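
To check that the two spellings agree, here is a small sketch comparing floor division by 1 with np.floor on a Series that includes a negative number and a NaN. The Series `s` is an invented example, and the sketch makes no timing claim.

```python
import numpy as np
import pandas as pd

# Invented sample: a positive, a negative, and a missing value.
s = pd.Series([3.2, -3.2, 6.8, np.nan])

floor_div = s // 1       # floor division, applied element-wise
np_floor = np.floor(s)   # NumPy's floor ufunc

# Both round toward negative infinity and leave NaN as NaN.
# Series.equals treats NaNs in the same position as equal.
same = floor_div.equals(np_floor)

print(floor_div.tolist())
print(same)
```

Note that both forms round toward negative infinity, so -3.2 becomes -4.0, which differs from the truncation toward zero that .astype(int) performs.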

Answer 3

Answered by ledawg

The answers here are pretty dated; as of pandas 0.25.2 (perhaps earlier) they produce the following message:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

Following the suggestion in that message, the assignment would be

df.iloc[:,0] = df.iloc[:,0].astype(int)

for one particular column.
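
One caveat: a plain .astype(int) still cannot represent the missing value from the question. A hedged sketch of an indexer-based assignment that tolerates NaN is shown below; the frame `df` and the flag `plain_cast_failed` are illustrative names, and the nullable "Int64" dtype assumes pandas 0.24 or later.

```python
import numpy as np
import pandas as pd

# Illustrative frame with one missing value, as in the question.
df = pd.DataFrame({"measure": [3.2, 6.8, np.nan, 3.5]})

# A plain integer cast fails because NaN has no int64 representation.
plain_cast_failed = False
try:
    df["measure"].astype(int)
except (ValueError, TypeError) as exc:
    plain_cast_failed = True
    print("astype(int) failed:", exc)

# Floor first, then cast to the nullable "Int64" dtype, assigning
# through .loc on the original frame rather than on a slice of it.
df.loc[:, "int_measure"] = np.floor(df["measure"]).astype("Int64")

print(df)
```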
