How to find maximum value of a column in python dataframe
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/43924686/
Asked by User12345
I have a data frame in pyspark. In this data frame I have a column called id that is unique.
Now I want to find the maximum value of the column id in the data frame.
I have tried the following:
df['id'].max()
But got the following error:
TypeError: 'Column' object is not callable
Please let me know how to find the maximum value of a column in a data frame.
In the answer by @Dadep, the link gives the correct answer.
Answered by Dadep
If you are using pandas, .max() will work:
>>> import pandas as pd
>>> df2 = pd.DataFrame({'A': [1, 5, 0], 'B': [3, 5, 6]})
>>> df2['A'].max()
5
Otherwise, if it's a Spark dataframe:
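The Spark example linked from the original answer is not reproduced on this page, so here is a minimal sketch, assuming a running SparkSession named spark; the dataframe df and its contents are made up for illustration:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
# hypothetical example data standing in for the asker's dataframe
df = spark.createDataFrame([(1,), (5,), (0,)], ["id"])

# agg returns a one-row dataframe; first()[0] pulls out the scalar max
max_id = df.agg(F.max("id")).first()[0]
print(max_id)  # 5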
Answered by Haroun Mohammedi
I'm coming from Scala, but I do believe that this is also applicable to Python.
val max = df.select(max("id")).first()
but you have to first import the following:
from pyspark.sql.functions import max
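The line above is Scala while the import is Python; a rough Python equivalent, assuming a pyspark dataframe df as before (note that importing max this way shadows Python's built-in max):

from pyspark.sql.functions import max

# select produces a one-row dataframe whose single column is max(id)
max_value = df.select(max("id")).first()[0]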
Answered by piyush kaushal
The following can be used in pyspark:
df.select(max("id")).show()
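Note that show() only prints the result table; a small sketch of getting the value back into Python instead, assuming the same df and import as in the answers above:

from pyspark.sql.functions import max

# collect() returns a list of Rows; [0][0] is the first row's first column
max_value = df.select(max("id")).collect()[0][0]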
Answered by Devendra Swami
You can use the aggregate max as also mentioned in the pyspark documentation link below:
Link: https://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=agg
Code:
row1 = df1.agg({"id": "max"}).collect()[0]
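A short usage sketch following on from the line above, assuming df1 is a pyspark dataframe with an id column; the dict form of agg names the result column max(id):

row1 = df1.agg({"id": "max"}).collect()[0]
# index the collected Row by the generated column name
max_value = row1["max(id)"]
print(max_value)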