
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same license and attribute it to the original authors (not me). Original source: http://stackoverflow.com/questions/30332619/

Date: 2020-10-22 07:10:15  Source: igfitidea

How to sort by column in descending order in Spark SQL?

Tags: scala, apache-spark, apache-spark-sql

Asked by Vedom

I tried df.orderBy("col1").show(10) but it sorted in ascending order. df.sort("col1").show(10) also sorts in ascending order. I looked on StackOverflow and the answers I found were all outdated or referred to RDDs. I'd like to use the native DataFrame in Spark.


Accepted answer by Vedom

It's in the sort method of org.apache.spark.sql.DataFrame:


df.sort($"col1", $"col2".desc)

Note the $ and the .desc inside sort, which select the column to sort the results by and flip it to descending order.

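Spark aside, the ascending-vs-descending mechanics can be sketched with plain Scala collections; here Ordering[Int].reverse plays the role that .desc (or desc("col1")) plays on a DataFrame column. The object and helper names below are made up for illustration:

```scala
// Plain-Scala sketch of ascending vs. descending sorts (no Spark needed).
// Ordering[Int].reverse is analogous to .desc on a Spark column.
object DescSortSketch {
  // Sort pairs by their Int key, ascending -- like df.orderBy(asc("col1"))
  def sortAsc[A](xs: Seq[(Int, A)]): Seq[(Int, A)] =
    xs.sortBy(_._1)

  // Sort pairs by their Int key, descending -- like df.orderBy(desc("col1"))
  def sortDesc[A](xs: Seq[(Int, A)]): Seq[(Int, A)] =
    xs.sortBy(_._1)(Ordering[Int].reverse)
}
```

For example, sortDesc(Seq((1, "a"), (3, "c"), (2, "b"))) returns the rows with keys in the order 3, 2, 1.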

Answered by Gabber

You can also sort the column by importing the Spark SQL functions:


import org.apache.spark.sql.functions._
df.orderBy(asc("col1"))

Or


import org.apache.spark.sql.functions._
df.sort(desc("col1"))

Or by importing sqlContext.implicits._:


import sqlContext.implicits._
df.orderBy($"col1".desc)

Or


import sqlContext.implicits._
df.sort($"col1".desc)

Answered by Nic Scozzaro

PySpark only


I came across this post when looking to do the same in PySpark. The easiest way is to just add the parameter ascending=False:


df.orderBy("col1", ascending=False).show(10)

Reference: http://spark.apache.org/docs/2.1.0/api/python/pyspark.sql.html#pyspark.sql.DataFrame.orderBy


Answered by Nitya Yekkirala

import org.apache.spark.sql.functions.{asc, desc}

df.orderBy(desc("columnname1"), desc("columnname2"), asc("columnname3"))
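The mixed-order, multi-column behaviour above can likewise be sketched with plain Scala collections, using a composite Ordering whose reversed components correspond to desc and whose plain component corresponds to asc. The row shape and object name below are invented for illustration:

```scala
// Plain-Scala sketch of a three-key sort: descending on the first two keys,
// ascending on the third -- mirroring
// df.orderBy(desc("columnname1"), desc("columnname2"), asc("columnname3")).
object MultiKeySortSketch {
  def sortRows(rows: Seq[(Int, Int, Int)]): Seq[(Int, Int, Int)] =
    rows.sorted(
      Ordering.Tuple3(Ordering[Int].reverse, Ordering[Int].reverse, Ordering[Int])
    )
}
```

For example, sortRows(Seq((1, 1, 2), (2, 1, 1), (1, 2, 3), (1, 1, 1))) puts (2, 1, 1) first, then (1, 2, 3), then the (1, 1, _) rows in ascending order of the third key.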

Answered by Nilesh Shinde

df.sort($"ColumnName".desc).show()

Answered by RPaul

In the case of Java:


If we use DataFrames, then while applying a join (here an inner join) we can sort (in ascending order) after selecting distinct elements in each DF:


Dataset<Row> d1 = e_data.distinct().join(s_data.distinct(), "e_id").orderBy("salary");

where e_id is the column on which the join is applied, and the result is sorted by salary in ascending order.


Also, we can use Spark SQL as:


SQLContext sqlCtx = spark.sqlContext();
sqlCtx.sql("select * from global_temp.salary order by salary desc").show();

where

  • spark  -> the SparkSession
  • salary -> a GlobalTemp view.