Spark DataFrame and renaming multiple columns (Java)

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/33015635/

Time: 2020-08-11 14:09:40  Source: igfitidea

Tags: java, apache-spark, apache-spark-sql

Asked by JiriS

Is there a nicer way to prefix or rename all (or several) columns of a given SparkSQL DataFrame at once than calling dataFrame.withColumnRenamed() multiple times?

For example, to detect changes with a full outer join, I end up with two DataFrames that have the same structure, so the columns of each side need distinct names.

Accepted answer by Zyoma

I suggest using the select() method for this. In fact, withColumnRenamed() itself uses select() internally. Here is an example of how to rename multiple columns (in Scala):

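import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

val someDataframe: DataFrame = ...

// Build one aliased Column per original name, then select them all at once.
// Note that select() keeps only the columns listed here.
val initialColumnNames = Seq("a", "b", "c")
val renamedColumns = initialColumnNames.map(name => col(name).as(s"renamed_$name"))
val renamedDataframe = someDataframe.select(renamedColumns: _*)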
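
Since the question asks about Java, the name-mapping step of the Scala snippet above can be sketched in plain Java. This is only a minimal sketch: the class ColumnPrefixer and the method prefixAll are hypothetical names, not part of Spark, and the input array stands in for what dataset.columns() would return. The Spark call itself appears only in a comment, so the block runs without Spark on the classpath.

```java
import java.util.Arrays;

// Hypothetical helper (not part of Spark): computes the new column names only.
final class ColumnPrefixer {

    // Returns a new array with the prefix prepended to every name, in order.
    static String[] prefixAll(String[] columnNames, String prefix) {
        return Arrays.stream(columnNames)
                .map(name -> prefix + name)
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // Stand-in for the result of dataset.columns() on a real Dataset<Row>.
        String[] columns = {"a", "b", "c"};
        String[] renamed = prefixAll(columns, "renamed_");
        System.out.println(Arrays.toString(renamed)); // prints [renamed_a, renamed_b, renamed_c]
        // With Spark on the classpath, the array could then be applied in one call:
        // Dataset<Row> result = dataset.toDF(renamed);
    }
}
```

Dataset.toDF(String...) expects exactly one name per existing column, so this approach fits the "rename everything" case; to rename only a subset, select() with aliased columns, as in the accepted answer, is the closer match.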

Answer by lanenok

I have just found the answer

from pyspark.sql.functions import col
df1_r = df1.select(*(col(x).alias(x + '_df1') for x in df1.columns))

on StackOverflow here (see the end of the accepted answer)

Answer by Devndra

// Rename cumulatively: '(' becomes '_', ')' becomes a space that trim removes at the ends.
var newsales_var = newsales
for (name <- newsales.columns) {
  val new_c = name.replace('(', '_').replace(')', ' ').trim
  newsales_var = newsales_var.withColumnRenamed(name, new_c)
}

Answer by Alsace

I think this method can help you.

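import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Both methods are assumed to live in the SystemUtils class that the original call refers to.
public class SystemUtils {

    // Renames every column from underscore_case to camelCase, one column at a time.
    public static Dataset<Row> renameDataFrame(Dataset<Row> dataset) {
        for (String column : dataset.columns()) {
            dataset = dataset.withColumnRenamed(column, SystemUtils.underscoreToCamelCase(column));
        }
        return dataset;
    }

    // Drops each '_' and upper-cases the character that follows it.
    public static String underscoreToCamelCase(String underscoreName) {
        StringBuilder result = new StringBuilder();
        if (underscoreName != null && underscoreName.length() > 0) {
            boolean flag = false; // true when the previous character was '_'
            for (int i = 0; i < underscoreName.length(); i++) {
                char ch = underscoreName.charAt(i);
                if (ch == '_') {
                    flag = true;
                } else {
                    if (flag) {
                        result.append(Character.toUpperCase(ch));
                        flag = false;
                    } else {
                        result.append(ch);
                    }
                }
            }
        }
        return result.toString();
    }
}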
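
To see what the helper above does with typical column names, here is a small standalone check. It is a sketch only: CamelCaseDemo and the sample names are made up for illustration, and the conversion logic is restated so the block runs without Spark.

```java
// Standalone demo; CamelCaseDemo and the sample names are illustrative only.
final class CamelCaseDemo {

    // Same logic as underscoreToCamelCase above: skip '_' and upper-case the next character.
    static String underscoreToCamelCase(String underscoreName) {
        StringBuilder result = new StringBuilder();
        if (underscoreName == null) {
            return result.toString();
        }
        boolean upperNext = false;
        for (char ch : underscoreName.toCharArray()) {
            if (ch == '_') {
                upperNext = true;
            } else if (upperNext) {
                result.append(Character.toUpperCase(ch));
                upperNext = false;
            } else {
                result.append(ch);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(underscoreToCamelCase("user_id"));     // prints userId
        System.out.println(underscoreToCamelCase("_created_at")); // prints CreatedAt
        System.out.println(underscoreToCamelCase("total__sum_")); // prints totalSum
    }
}
```

One thing to watch for: different source names can collapse to the same result (for instance "a_b" and "a__b" both become "aB"), which would leave the renamed Dataset with duplicate column names.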