Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/40836208/

Time: 2020-09-14 02:31:21  Source: igfitidea

Set maximum value (upper bound) in pandas DataFrame

Tags: python, pandas, dataframe, max

Asked by elPastor

I'm trying to set a maximum value of a pandas DataFrame column. For example:

import pandas as pd

my_dict = {'a':[10,12,15,17,19,20]}
df = pd.DataFrame(my_dict)

df['a'].set_max(15)  # hypothetical method; it does not exist in pandas

would yield:

    a
0   10
1   12
2   15
3   15
4   15
5   15

But it doesn't.

There are a million solutions to find the maximum value, but nothing to set the maximum value... at least that I can find.

I could iterate through the list, but I suspect there is a faster way to do it with pandas. My lists will be significantly longer, so I would expect iteration to take a relatively long time. Also, I'd like the solution to be able to handle NaN.

Answered by Psidom

I suppose you can do:

maxVal = 15
# where() keeps values where the condition holds and replaces the rest with maxVal
df['a'].where(df['a'] <= maxVal, maxVal)

#0    10
#1    12
#2    15
#3    15
#4    15
#5    15
#Name: a, dtype: int64

Or:

# .loc avoids chained assignment, which may fail to update df in place
df.loc[df['a'] >= maxVal, 'a'] = maxVal
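
Since the question asks for NaN handling, it may be worth checking how the comparison behaves on missing values. The sketch below uses a made-up Series (the data and the name max_val are illustrative, not from the original answer) and suggests that where() with a `<=` condition overwrites NaN, while mask() with a `>` condition leaves it alone:

```python
import numpy as np
import pandas as pd

max_val = 15  # illustrative cap, same value as in the answer above
s = pd.Series([10, 12, np.nan, 17, 20], name="a")

# NaN <= max_val evaluates to False, so where() replaces the NaN with max_val.
capped_where = s.where(s <= max_val, max_val)

# NaN > max_val is also False, so mask() leaves the NaN untouched
# and only overwrites values that really exceed the cap.
capped_mask = s.mask(s > max_val, max_val)

print(capped_where.tolist())
print(capped_mask.tolist())
```

If missing values should stay missing, the mask() form appears to be the safer of the two.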

Answered by tommy.carstensen

You can use clip.

Apply to all columns of the data frame:

df.clip(upper=15)

Otherwise, apply it to selected columns only:

df.clip(upper=pd.Series({'a': 15}), axis=1)
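
As a rough illustration of both calls, here is a small sketch on a made-up two-column frame (column b and its per-column cap of 10 are assumptions added for the example). Note that clip returns a new object rather than modifying df, and that NaN entries are expected to pass through unchanged:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [10, 12, 15, 17, 19, 20],
    'b': [1.0, np.nan, 30.0, 5.0, 50.0, 2.0],  # hypothetical extra column
})

# One cap for every column; the NaN in 'b' is left as NaN.
capped_all = df.clip(upper=15)

# A different cap per column: the Series index is matched against
# the column labels because axis=1.
per_column = df.clip(upper=pd.Series({'a': 15, 'b': 10}), axis=1)

print(capped_all)
print(per_column)
```

To keep the result, assign it back, for example df = df.clip(upper=15) or df['a'] = df['a'].clip(upper=15).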

Answered by cs95

numpy.clip is a good, fast alternative.

df

    a
0  10
1  12
2  15
3  17
4  19
5  20

import numpy as np

np.clip(df['a'], a_max=15, a_min=None)

0    10
1    12
2    15
3    15
4    15
5    15
Name: a, dtype: int64

# Or,
np.clip(df['a'].to_numpy(), a_max=15, a_min=None)
# array([10, 12, 15, 15, 15, 15])
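
For the NaN requirement in the question, a quick check on a plain NumPy array may help. This is only a sketch on invented data; it relies on np.clip accepting None for the lower bound and shows np.minimum as an element-wise equivalent for the upper-bound-only case:

```python
import numpy as np

values = np.array([10.0, 12.0, np.nan, 17.0, 20.0])  # illustrative data

# Passing None as the lower bound clips on the upper side only.
capped = np.clip(values, None, 15)

# np.minimum is an element-wise alternative; it also propagates NaN.
capped_min = np.minimum(values, 15)

print(capped)
print(capped_min)
```

In both results the NaN entry stays NaN, while 17 and 20 are capped at 15.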


From v0.21 onwards, you can also use DataFrame.clip_upper.

Note
This method (along with clip_lower) was deprecated in v0.24 and removed in pandas 1.0; on current versions, use clip(upper=...) instead.

df.clip_upper(15)
# Or, for a specific column,
df['a'].clip_upper(15)

    a
0  10
1  12
2  15
3  15
4  15
5  15

In a similar vein, if you only want to set the lower bound, use DataFrame.clip_lower. These methods are also available on Series objects.
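
On pandas versions where clip_upper and clip_lower are unavailable, the lower and upper keywords of clip should cover the same cases. A minimal sketch, reusing the question's data (the bounds 12 and 17 are arbitrary choices for the example):

```python
import pandas as pd

df = pd.DataFrame({'a': [10, 12, 15, 17, 19, 20]})

upper_only = df['a'].clip(upper=15)         # stands in for clip_upper(15)
lower_only = df['a'].clip(lower=15)         # stands in for clip_lower(15)
both = df['a'].clip(lower=12, upper=17)     # both bounds in one call

print(upper_only.tolist())
print(lower_only.tolist())
print(both.tolist())
```

The same keywords work on a whole DataFrame, for example df.clip(lower=12, upper=17).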