Note: this page is a Chinese-English parallel translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/25479607/

Date: 2020-08-18 20:16:27  Source: igfitidea

Pandas min() of selected row and columns

Tags: python, pandas, row, minimum, calculated-columns

Asked by yash.trojan.25

I am trying to create a column that contains, for each row, the minimum across a few selected columns. For example:

    A0      A1      A2      B0      B1      B2      C0      C1
0   0.84    0.47    0.55    0.46    0.76    0.42    0.24    0.75
1   0.43    0.47    0.93    0.39    0.58    0.83    0.35    0.39
2   0.12    0.17    0.35    0.00    0.19    0.22    0.93    0.73
3   0.95    0.56    0.84    0.74    0.52    0.51    0.28    0.03
4   0.73    0.19    0.88    0.51    0.73    0.69    0.74    0.61
5   0.18    0.46    0.62    0.84    0.68    0.17    0.02    0.53
6   0.38    0.55    0.80    0.87    0.01    0.88    0.56    0.72

Here I am trying to create a column which contains the minimum for each row of columns B0, B1, B2.

The output would look like this:

    A0      A1      A2      B0      B1      B2      C0      C1      Minimum
0   0.84    0.47    0.55    0.46    0.76    0.42    0.24    0.75    0.42
1   0.43    0.47    0.93    0.39    0.58    0.83    0.35    0.39    0.39
2   0.12    0.17    0.35    0.00    0.19    0.22    0.93    0.73    0.00
3   0.95    0.56    0.84    0.74    0.52    0.51    0.28    0.03    0.51
4   0.73    0.19    0.88    0.51    0.73    0.69    0.74    0.61    0.51
5   0.18    0.46    0.62    0.84    0.68    0.17    0.02    0.53    0.17
6   0.38    0.55    0.80    0.87    0.01    0.88    0.56    0.72    0.01

Here is part of the code, but it is not doing what I want it to do:

for i in range(0,2):
    df['Minimum'] = df.loc[0,'B'+str(i)].min()

Answered by Marius

This is a one-liner: you just need to use the axis argument of min() to tell it to work across the columns rather than down:

df['Minimum'] = df.loc[:, ['B0', 'B1', 'B2']].min(axis=1)
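
To make the effect of axis concrete, here is a minimal sketch that rebuilds part of the question's table (first three rows and a subset of columns only) and compares the default reduction with axis=1. The variable names b_cols and per_column are illustrative and not from the original answer.

```python
import pandas as pd

# Sample data copied from the question (first three rows, some columns).
df = pd.DataFrame({
    "A0": [0.84, 0.43, 0.12],
    "B0": [0.46, 0.39, 0.00],
    "B1": [0.76, 0.58, 0.19],
    "B2": [0.42, 0.83, 0.22],
    "C0": [0.24, 0.35, 0.93],
})

b_cols = ["B0", "B1", "B2"]

# axis=0 (the default) reduces down each column: one value per column.
per_column = df[b_cols].min()

# axis=1 reduces across the selected columns: one value per row.
df["Minimum"] = df[b_cols].min(axis=1)

print(per_column)
print(df)
```

The Minimum values for these three rows match the first three rows of the expected output in the question (0.42, 0.39, 0.00).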

If you need to use this solution for different numbers of columns, you can use a for loop or list comprehension to construct the list of columns:

# Number of B columns to include: 2 selects B0 and B1; use 3 to cover B0..B2.
n_columns = 2
cols_to_use = ['B' + str(i) for i in range(n_columns)]
df['Minimum'] = df.loc[:, cols_to_use].min(axis=1)
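
If the number of columns is not known in advance, one possible alternative is to select them by name pattern instead of by count. The sketch below assumes the columns of interest are named "B" followed by digits and uses DataFrame.filter with a regular expression; this is a swapped-in technique, not part of the original answer, and the names cols_from_count and cols_from_regex are illustrative.

```python
import pandas as pd

# Two rows from the question, with a subset of the columns.
df = pd.DataFrame({
    "A0": [0.84, 0.43],
    "B0": [0.46, 0.39],
    "B1": [0.76, 0.58],
    "B2": [0.42, 0.83],
})

# Option 1: build the names from a count, as in the answer above.
n_columns = 3  # covers B0, B1, B2
cols_from_count = ["B" + str(i) for i in range(n_columns)]

# Option 2: keep every column named "B" followed by digits,
# so no count has to be maintained by hand.
cols_from_regex = list(df.filter(regex=r"^B\d+$").columns)

df["Minimum"] = df[cols_from_regex].min(axis=1)
print(df)
```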

Answered by DenPO

For my tasks, the following is a general and flexible approach:

df['Minimum'] = df[['B0', 'B1', 'B2']].apply(lambda x: min(x.iloc[0], x.iloc[1], x.iloc[2]), axis=1)

The target column 'Minimum' is assigned the result of the lambda function applied to the selected DataFrame columns ['B0', 'B1', 'B2']. Inside the function, access the elements of each row through the lambda's parameter (x here) and their position within the selection (if there is more than one element). Be sure to specify axis=1, which makes the calculation run row by row. This is very convenient when you need to make complex calculations. However, I assume that such a solution may be slower.
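
The speed concern can be checked with a rough timing. The sketch below uses hypothetical random data (not from the question) and the standard timeit module to compare a row-wise apply with the built-in min(axis=1); the helper names with_apply and vectorized are illustrative, and the actual timings depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical random data, only to have enough rows for a timing.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10_000, 3)), columns=["B0", "B1", "B2"])


def with_apply():
    # Calls the Python lambda once per row.
    return df.apply(lambda row: min(row), axis=1)


def vectorized():
    # Lets pandas reduce across the columns internally.
    return df.min(axis=1)


# Both approaches should produce the same values.
same = np.allclose(with_apply().to_numpy(), vectorized().to_numpy())

t_apply = timeit.timeit(with_apply, number=3)
t_vec = timeit.timeit(vectorized, number=3)
print(f"apply: {t_apply:.3f}s, min(axis=1): {t_vec:.3f}s, same result: {same}")
```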

As for the selection of columns, in addition to the 'for' method, I can suggest using a filter like this:

calls_to_use = list(filter(lambda f:'B' in f, df.columns))

Literally, a filter is applied to the list of DataFrame columns through a lambda function that checks whether each column name contains the letter 'B'.
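
Note that a substring test matches the letter anywhere in the name. The following sketch, using a hypothetical frame with an extra "ABC" column that is not in the question, contrasts it with a prefix test based on str.startswith; the names contains_b and starts_with_b are illustrative.

```python
import pandas as pd

# Hypothetical frame with an extra "ABC" column to show the difference.
df = pd.DataFrame({
    "ABC": [9.0, 9.0],
    "B0": [0.46, 0.39],
    "B1": [0.76, 0.58],
})

# Substring test, as in the answer: also picks up "ABC".
contains_b = list(filter(lambda name: "B" in name, df.columns))

# Prefix test: only names that start with "B".
starts_with_b = [name for name in df.columns if name.startswith("B")]

print(contains_b)
print(starts_with_b)
```

Which test is appropriate depends on how the columns are named; for the question's data (A0..A2, B0..B2, C0, C1) both select the same three columns.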

After that, the first example can be written as follows:

calls_to_use = list(filter(lambda f:'B' in f, df.columns))    
df['Minimum'] = df[calls_to_use].apply(lambda x: min(x), axis=1)

Although, after pre-selecting the columns, the following would be preferable:

df['Minimum'] = df[calls_to_use].min(axis=1)