java 如何在spark java中的数据集上应用map函数
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/41116694/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-11-03 05:41:07 来源:igfitidea点击:
How to apply map function on dataset in spark java
提问by kumar
My CSVFile:
我的CSV文件:
YEAR,UTILITY_ID,UTILITY_NAME,OWNERSHIP,STATE_CODE,AMR_METERING_RESIDENTIAL,AMR_METERING_COMMERCIAL,AMR_METERING_INDUSTRIAL,AMR_METERING_TRANS,AMR_METERING_TOTAL,AMI_METERING_RESIDENTIAL,AMI_METERING_COMMERCIAL,AMI_METERING_INDUSTRIAL,AMI_METERING_TRANS,AMI_METERING_TOTAL,ENERGY_SERVED_RESIDENTIAL,ENERGY_SERVED_COMMERCIAL,ENERGY_SERVED_INDUSTRIAL,ENERGY_SERVED_TRANS,ENERGY_SERVED_TOTAL
2011,34,City of Abbeville - (SC),M,SC,880,14,,,894,,,,,,,,,,
2011,84,A & N Electric Coop,C,MD,135,25,,,160,,,,,,,,,,
2011,84,A & N Electric Coop,C,VA,31893,2107,0,,34000,,,,,,,,,,
2011,97,Adams Electric Coop,C,IL,8334,190,,,8524,,,,,0,,,,,0
2011,108,Adams-Columbia Electric Coop,C,WI,33524,1788,709,,36021,,,,,,,,,,
2011,118,Adams Rural Electric Coop, Inc,C,OH,7457,20,,,7477,,,,,,,,,,
2011,122,Village of Arcade,M,NY,3560,498,100,,4158,,,,,,,,,,
2011,155,Agralite Electric Coop,C,MN,4383,227,315,,4925,,,,,,,,,,
Here down the Sparkcode to read the CSVfile:
下面是用于读取CSV文件的Spark代码:
public class ReadFile8 {
public static void main(String[] args) throws IOException {
SparkSession session = new SparkSession.Builder().appName("CsvReader").master("local").getOrCreate();
//Data taken by Local System
Dataset<Row> file8Data = session.read().format("com.databricks.spark.csv").option("header", "true").load("file:///home/kumar/Desktop/Eletricaldata/file8_2011.csv");
// Register the DataFrame as a SQL temporary view
file8Data.createOrReplaceTempView("EletricalFile8Data");
file8Data.show();
}
}
How can apply a map function and flatmap function in Sparkusing Java?
如何使用Java在Spark 中应用 map 函数和 flatmap 函数?
回答by Anton Okolnychyi
You can use the following code as an example:
您可以使用以下代码作为示例:
Dataset<Integer> years = file8Data.map((MapFunction<Row, Integer>) row -> row.<Integer>getAs("YEAR"), Encoders.INT());
Dataset<Integer> newYears = years.flatMap((FlatMapFunction<Integer, Integer>) year -> {
return Arrays.asList(year + 1, year + 2).iterator();
}, Encoders.INT());