'GroupedData' object has no attribute 'show' when doing pivot in Spark DataFrame
Original question: http://stackoverflow.com/questions/51820994/
Source: Stack Overflow, licensed under CC BY-SA 4.0. You are free to use and share this content, but you must attribute it to the original authors.
Asked by Nabih Bawazir
I want to pivot a Spark DataFrame. I referred to the PySpark documentation and, based on the pivot function, the clue is .groupBy('name').pivot('name', values=None). Here's my dataset:
In[75]: spDF.show()
Out[75]:
+-----------+-----------+
|customer_id|       name|
+-----------+-----------+
|      25620| MCDonnalds|
|      25620|  STARBUCKS|
|      25620|        nan|
|      25620|        nan|
|      25620| MCDonnalds|
|      25620|        nan|
|      25620| MCDonnalds|
|      25620|DUNKINDONUT|
|      25620|   LOTTERIA|
|      25620|        nan|
|      25620| MCDonnalds|
|      25620|DUNKINDONUT|
|      25620|DUNKINDONUT|
|      25620|        nan|
|      25620|        nan|
|      25620|        nan|
|      25620|        nan|
|      25620|   LOTTERIA|
|      25620|   LOTTERIA|
|      25620|  STARBUCKS|
+-----------+-----------+
only showing top 20 rows
Then I try to pivot the table on name:
In [96]:
spDF.groupBy('name').pivot('name', values=None)
Out[96]:
<pyspark.sql.group.GroupedData at 0x7f0ad03750f0>
And when I try to show the result:
In [98]:
spDF.groupBy('name').pivot('name', values=None).show()
Out [98]:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-98-94354082e956> in <module>()
----> 1 spDF.groupBy('name').pivot('name', values=None).show()
AttributeError: 'GroupedData' object has no attribute 'show'
I don't know why 'GroupedData' can't be shown. What should I do to solve the issue?
Answered by chromaerror
The pivot() method returns a GroupedData object, just like groupBy(). You cannot call show() on a GroupedData object; you first have to apply an aggregate function to it (such as sum() or even count()), which turns it back into a DataFrame.
See this article for more information.

