Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/43924686/

Time: 2020-08-19 23:32:00  Source: igfitidea

How to find maximum value of a column in python dataframe

python dataframe pyspark

Asked by User12345

I have a data frame in pyspark. In this data frame I have a column called id that is unique.

Now I want to find the maximum value of the column id in the data frame.

I have tried the following:

df['id'].max()

But I got the following error:

TypeError: 'Column' object is not callable

Please let me know how to find the maximum value of a column in a data frame.

In the answer by @Dadep the link gives the correct answer
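
A likely reason for the error, as far as I understand PySpark's Column class: an attribute that is not a known Column method is treated as a field lookup and returns another Column, so df['id'].max is itself a Column, and calling it fails. The sketch below imitates that behaviour in plain Python. FakeColumn is a hypothetical stand-in written for illustration only; it is not PySpark code.

```python
class FakeColumn:
    """Hypothetical stand-in that imitates how unknown attributes are resolved."""

    def __init__(self, name):
        self.name = name

    def __getattr__(self, item):
        # Only reached when normal attribute lookup fails.
        if item.startswith("__"):
            raise AttributeError(item)
        # Unknown attributes become a nested field reference, not a method.
        return FakeColumn(f"{self.name}.{item}")


col = FakeColumn("id")
try:
    col.max()  # col.max is another FakeColumn, which has no __call__
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

This is why the answers below go through an aggregate function on the data frame instead of a method on the column.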

Answered by Dadep

If you are using pandas, .max() will work:

>>> import pandas as pd
>>> df2 = pd.DataFrame({'A': [1, 5, 0], 'B': [3, 5, 6]})
>>> df2['A'].max()
5

Otherwise, if it is a Spark dataframe, see:

Best way to get the max value in a Spark dataframe column

Answered by Haroun Mohammedi

I come from Scala, but I believe this also applies to Python.

val maxRow = df.select(max("id")).first()

In Python, drop the val keyword; you first have to import the following:

from pyspark.sql.functions import max

Answered by piyush kaushal

The following can be used in pyspark:

df.select(max("id")).show()

Answered by Devendra Swami

You can use the aggregate max, as also mentioned in the pyspark documentation linked below:

Link: https://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=agg

Code:

row1 = df1.agg({"id": "max"}).collect()[0]
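
Note that collect()[0] gives back a Row, not a bare number. A minimal sketch of pulling the value out is below; column_max is a hypothetical helper name chosen for illustration (it is not part of PySpark), and it assumes df is a Spark data frame that has the named column.

```python
def column_max(df, column):
    """Return the maximum of `column` as a plain Python value.

    Hypothetical helper for illustration; `df` is assumed to be a
    Spark DataFrame containing `column`.
    """
    # agg({...}) produces a one-row DataFrame; collect() returns a list of Rows.
    row = df.agg({column: "max"}).collect()[0]
    # Index by position so the code does not depend on the generated
    # column name (typically "max(id)" for the id column).
    return row[0]
```

With the data frame from the question, column_max(df, "id") would give the largest id. The same positional indexing should also work on the Row returned by df.select(max("id")).first().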