pandas: take the minimum between a column value and a constant global value

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/33689714/

Date: 2020-09-14 00:13:37  Source: igfitidea

take minimum between column value and constant global value

Tags: python, numpy, pandas, dataframe, minimum

Asked by EdChum

I would like to create a new column for a given dataframe in which I calculate the minimum between the column value and some global value (in this example 7). My df has the columns session and note, and my desired output column is minValue:

session     note     minValue
1       0.726841     0.726841
2       3.163402     3.163402  
3       2.844161     2.844161
4       NaN          NaN

I'm using the built-in Python function min:

df['minValue']=min(7, df['note'])

and I get this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Answered by EdChum

Use np.minimum (the example below uses 1 rather than 7 as the bound, so the effect is visible):

In [341]:
df['MinNote'] = np.minimum(1,df['note'])
df

Out[341]:
   session      note  minValue   MinNote
0        1  0.726841  0.726841  0.726841
1        2  3.163402  3.163402  1.000000
2        3  2.844161  2.844161  1.000000
3        4       NaN       NaN       NaN
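For readers who want to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. The DataFrame is rebuilt from the question's table, and the name GLOBAL_MAX and its value 3 are assumptions picked only so that one row is visibly capped (the question's own bound is 7).

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question's table.
df = pd.DataFrame({
    "session": [1, 2, 3, 4],
    "note": [0.726841, 3.163402, 2.844161, np.nan],
})

GLOBAL_MAX = 3  # hypothetical cap, not from the original post

# np.minimum compares element by element; missing values stay NaN.
df["minValue"] = np.minimum(GLOBAL_MAX, df["note"])
print(df)
```

Only the second row exceeds the cap here, so it is the only value that changes.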

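Also, min doesn't understand array-like comparisons, hence your error: it compares its arguments with < and then needs a single True/False from the result, but comparing a Series gives a boolean Series whose truth value is ambiguous.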
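The failure can be reproduced in isolation. The sketch below is only an illustration with a standalone Series (the variable names are assumptions, not from the original post): the comparison itself succeeds, and it is the attempt to collapse its result into a single boolean that raises.

```python
import pandas as pd

note = pd.Series([0.726841, 3.163402, 2.844161, float("nan")])

# The comparison works and yields one boolean per element.
mask = note < 7
print(mask.tolist())

# Built-in min needs a single True/False from such a comparison and fails.
try:
    min(7, note)
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```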

Answered by Marc Garcia

The preferred way to do this in pandas is to use the Series.clip() method.

In your example:

import pandas

df = pandas.DataFrame({'session': [1, 2, 3, 4],
                       'note': [0.726841, 3.163402, 2.844161, float('NaN')]})

df['minValue'] = df['note'].clip(upper=1.)
df

This will return:

       note  session  minValue
0  0.726841        1  0.726841
1  3.163402        2  1.000000
2  2.844161        3  1.000000
3       NaN        4       NaN

numpy.minimum will also work, but .clip() has some advantages:

  • It is more readable
  • You can apply lower and upper bounds simultaneously: df['note'].clip(lower=0., upper=10.)
  • You can chain it with other methods: df['note'].abs().clip(upper=1.).round()
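The points above can be checked with a short sketch. The sample values, including the negative one, are assumptions chosen to exercise both bounds; they are not taken from the question.

```python
import numpy as np
import pandas as pd

note = pd.Series([-2.5, 0.726841, 3.163402, np.nan])  # hypothetical values

# An upper-bound clip gives the same result as np.minimum.
capped = note.clip(upper=1.0)
same_as_numpy = np.minimum(note, 1.0)
print(capped.equals(same_as_numpy))

# Lower and upper bounds in a single call.
bounded = note.clip(lower=0.0, upper=1.0)
print(bounded)

# clip returns a Series, so it chains with other Series methods.
chained = note.abs().clip(upper=1.0).round()
print(chained)
```

In every case the missing value stays NaN, so no separate handling is needed before clipping.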