Rounding down values in Pandas dataframe column with NaNs

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/35873927/

Date: 2020-09-14 00:49:54  Source: igfitidea

python pandas dataframe rounding

Asked by user1718097

I have a Pandas dataframe that contains a column of float64 values:

import numpy as np
import pandas as pd

tempDF = pd.DataFrame({ 'id': [12,12,12,12,45,45,45,51,51,51,51,51,51,76,76,76,91,91,91,91],
                        'measure': [3.2,4.2,6.8,5.6,3.1,4.8,8.8,3.0,1.9,2.1,2.4,3.5,4.2,5.2,4.3,3.6,5.2,7.1,6.5,7.3]})

I want to create a new column containing just the integer part. My first thought was to use .astype(int):

tempDF['int_measure'] = tempDF['measure'].astype(int)

This works fine but, as an extra complication, the column I have contains a missing value:

tempDF.loc[10,'measure'] = np.nan

This missing value causes the .astype(int) method to fail with:

ValueError: Cannot convert NA to integer
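
The failure can be reproduced with a minimal sketch. The short Series below is illustrative data (an assumption, not the full frame from the question), and the exact wording of the exception message may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Illustrative data: a float column with one missing value.
measure = pd.Series([3.2, 4.2, np.nan, 5.6], name="measure")

try:
    measure.astype(int)
    cast_failed = False
except ValueError as exc:
    # NaN has no integer representation, so the cast is rejected.
    cast_failed = True
    print(f"astype(int) failed: {exc}")
```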

I thought I could round down the floats in the column of data. However, the .round(0) function will round to the nearest integer (higher or lower) rather than rounding down. I can't find a function equivalent to ".floor()" that will act on a column of a Pandas dataframe.

Any suggestions?

Answered by Joachim Isaksson

You could just apply numpy.floor:

import numpy as np

tempDF['int_measure'] = tempDF['measure'].apply(np.floor)

    id  measure  int_measure
0   12      3.2            3
1   12      4.2            4
2   12      6.8            6
...
9   51      2.1            2
10  51      NaN          NaN
11  51      3.5            3
...
19  91      7.3            7
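
As a variation, np.floor can probably also be called on the column directly, without .apply. The result stays float64 because NaN cannot be stored in a plain integer column. If an integer dtype is really wanted, one option is the nullable "Int64" dtype, assuming a pandas version that supports it. The small frame below is illustrative data, not the frame from the question.

```python
import numpy as np
import pandas as pd

# Illustrative frame with one missing measurement.
tempDF = pd.DataFrame({"id": [12, 12, 51, 51],
                       "measure": [3.2, 6.8, np.nan, 3.5]})

# Vectorized floor: NaN is passed through, dtype remains float64.
floored = np.floor(tempDF["measure"])

# Optional: nullable integer dtype keeps the missing value as <NA>.
tempDF["int_measure"] = floored.astype("Int64")

print(tempDF)
```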

Answered by Alexander

You could also try:

df.apply(lambda s: s // 1)

Using np.floor is faster, however.
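
A small sketch of the floor-division approach applied to a single column rather than the whole frame. The values are made up for illustration; a negative number is included to show that both approaches round toward negative infinity, and NaN is left untouched.

```python
import numpy as np
import pandas as pd

# Illustrative values, including a negative number and a missing value.
s = pd.Series([3.2, -2.5, np.nan, 7.9])

by_floordiv = s // 1        # floor division by 1
by_floor = np.floor(s)      # NumPy floor

# Series.equals treats NaNs in the same positions as equal.
same_result = by_floordiv.equals(by_floor)
print(by_floordiv.tolist(), same_result)
```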

Answered by ledawg

The answers here are pretty dated. As of pandas 0.25.2 (perhaps earlier), they raise the following message (pandas' SettingWithCopyWarning):

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

Following that advice, the assignment would be

df.iloc[:,0] = df.iloc[:,0].astype(int)

for one particular column.
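
Note that .astype(int) would still fail if the column contains NaN. A hedged sketch of one way to combine the indexer-based assignment with np.floor follows; the frame, the filter, and the names full and subset are hypothetical, chosen only to show a case where the frame being modified was sliced from another one.

```python
import numpy as np
import pandas as pd

# Hypothetical source frame with one missing measurement.
full = pd.DataFrame({"id": [12, 45, 51, 76],
                     "measure": [3.2, 4.8, np.nan, 5.2]})

# Take an explicit copy of the slice so later assignments are unambiguous.
subset = full[full["id"] > 12].copy()

# Assign through .loc; np.floor keeps NaN, so the new column is float64.
subset.loc[:, "int_measure"] = np.floor(subset["measure"])

print(subset)
```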
