'GroupedData' object has no attribute 'show' when doing a pivot on a Spark DataFrame
Original question: http://stackoverflow.com/questions/51820994/
License: CC BY-SA 4.0. You are free to use and share this content, but you must attribute it to the original authors on StackOverflow.
Asked by Nabih Bawazir
I want to pivot a Spark DataFrame. I referred to the PySpark documentation, and based on the pivot function, the clue is .groupBy('name').pivot('name', values=None). Here's my dataset:
In[75]: spDF.show()
Out[75]:
+-----------+-----------+
|customer_id|       name|
+-----------+-----------+
|      25620| MCDonnalds|
|      25620|  STARBUCKS|
|      25620|        nan|
|      25620|        nan|
|      25620| MCDonnalds|
|      25620|        nan|
|      25620| MCDonnalds|
|      25620|DUNKINDONUT|
|      25620|   LOTTERIA|
|      25620|        nan|
|      25620| MCDonnalds|
|      25620|DUNKINDONUT|
|      25620|DUNKINDONUT|
|      25620|        nan|
|      25620|        nan|
|      25620|        nan|
|      25620|        nan|
|      25620|   LOTTERIA|
|      25620|   LOTTERIA|
|      25620|  STARBUCKS|
+-----------+-----------+
only showing top 20 rows
Then I try to pivot the table on the name column:
In [96]:
spDF.groupBy('name').pivot('name', values=None)
Out[96]:
<pyspark.sql.group.GroupedData at 0x7f0ad03750f0>
But when I try to show the result:
In [98]:
spDF.groupBy('name').pivot('name', values=None).show()
Out [98]:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-98-94354082e956> in <module>()
----> 1 spDF.groupBy('name').pivot('name', values=None).show()
AttributeError: 'GroupedData' object has no attribute 'show'
I don't know why 'GroupedData' can't be shown. What should I do to solve the issue?
Answered by chromaerror
The pivot() method returns a GroupedData object, just like groupBy(). You cannot call show() on a GroupedData object until you have applied an aggregate function (such as sum() or even count()) to it, because only the aggregation turns it back into a DataFrame.
See this article for more information.