Python Pandas: Find the maximum range in all the columns of a dataframe

Notice: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/24748848/

Date: 2020-08-19 05:07:39  Source: igfitidea

python, pandas

Asked by HolaGonzalo

I'm very new to programming, so hopefully I'll ask my question clearly and perhaps you can guide me to the answer.

I have a dataframe "x", where the index represents the week of the year, and each column holds a numerical value for a city. I'm attempting to find the column that has the maximum range (i.e., maximum value minus minimum value). I imagine this will need a loop to find the maximum and minimum of each column, store the differences as an object (or perhaps as a new row at the bottom?), and then find the max in that object (or row).

The dataframe looks like this:

        City1 City2 ... CityN 
week
1
2
3
4
...
53

Feedback on etiquette or wording is also appreciated.

Accepted answer by DSM

Something like (df.max() - df.min()).idxmax() should get you the column with the maximum range:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.random((5,4)), index=pd.Series(range(1,6), name="week"), columns=["City{}".format(i) for i in range(1,5)])
>>> df
         City1     City2     City3     City4
week                                        
1     0.908549  0.496167  0.220340  0.464060
2     0.429330  0.770133  0.824774  0.155694
3     0.893270  0.980108  0.574897  0.378443
4     0.982410  0.796103  0.080877  0.416432
5     0.444416  0.667695  0.459362  0.898792
>>> df.max() - df.min()
City1    0.553080
City2    0.483941
City3    0.743898
City4    0.743098
dtype: float64
>>> (df.max() - df.min()).idxmax()
'City3'
>>> df[(df.max() - df.min()).idxmax()]
week
1       0.220340
2       0.824774
3       0.574897
4       0.080877
5       0.459362
Name: City3, dtype: float64
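
Because the session above uses random data, its numbers will differ on every run. As a reproducible sketch, the same idea can be tried on a small hand-made frame; the values below are hypothetical and chosen only so the ranges are easy to check by eye. The explicit loop is included for comparison with the approach imagined in the question, and is not needed in practice:

```python
import pandas as pd

# Hypothetical sample data: weeks as the index, one column per city.
df = pd.DataFrame(
    {
        "City1": [3.0, 5.0, 4.0],
        "City2": [1.0, 9.0, 2.0],
        "City3": [7.0, 7.5, 6.0],
    },
    index=pd.Index([1, 2, 3], name="week"),
)

# Vectorized: per-column range (max minus min), then the label of the largest.
col_ranges = df.max() - df.min()
widest = col_ranges.idxmax()

# Explicit loop over the columns, computing the same ranges one by one.
loop_ranges = {col: df[col].max() - df[col].min() for col in df.columns}
widest_loop = max(loop_ranges, key=loop_ranges.get)

print(col_ranges)
print(widest, widest_loop)  # both name City2 (range 8.0)
```

Both versions agree here; the vectorized form is shorter and avoids writing the loop by hand.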

If there might be more than one column at maximum range, you'll probably want something like

>>> col_ranges = df.max() - df.min()
>>> df.loc[:,col_ranges == col_ranges.max()]
         City3
week          
1     0.220340
2     0.824774
3     0.574897
4     0.080877
5     0.459362

instead.
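
To see the difference between the two forms, here is a minimal sketch with hypothetical integer data in which two columns are deliberately tied for the widest range. idxmax() reports only the first of the tied labels, while the boolean mask keeps every tied column:

```python
import pandas as pd

# Hypothetical data: City1 and City3 both have a range of 10.
df = pd.DataFrame(
    {
        "City1": [0, 10, 5],
        "City2": [2, 4, 3],
        "City3": [20, 30, 25],
    },
    index=pd.Index([1, 2, 3], name="week"),
)

col_ranges = df.max() - df.min()

# idxmax() returns the first label at which the maximum occurs.
first_only = col_ranges.idxmax()

# The boolean mask selects every column whose range equals the maximum.
tied = df.loc[:, col_ranges == col_ranges.max()]
tied_names = list(tied.columns)

print(first_only)  # City1
print(tied_names)  # ['City1', 'City3']
```

With floating-point data, exact equality may miss columns whose ranges differ only by rounding error; comparing with a tolerance (for example via numpy.isclose) is one possible adjustment.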
