How to create encoder for custom Java objects?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must follow the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/39188504/
Asked by Pradeep
I am using the following class to create a bean with Spark's Encoders:
class OuterClass implements Serializable {
    int id;
    ArrayList<InnerClass> listofInner;

    public int getId() {
        return id;
    }
    public void setId(int num) {
        this.id = num;
    }
    public ArrayList<InnerClass> getListofInner() {
        return listofInner;
    }
    public void setListofInner(ArrayList<InnerClass> list) {
        this.listofInner = list;
    }
}

public static class InnerClass implements Serializable {
    String streetno;

    public void setStreetno(String streetno) {
        this.streetno = streetno;
    }
    public String getStreetno() {
        return streetno;
    }
}

Encoder<OuterClass> outerClassEncoder = Encoders.bean(OuterClass.class);
Dataset<OuterClass> ds = spark.createDataset(Collections.singletonList(outerclassList), outerClassEncoder);
And I am getting the following error:
Exception in thread "main" java.lang.UnsupportedOperationException: Cannot infer type for class OuterClass$InnerClass because it is not bean-compliant
How can I implement this type of use case for Spark in Java? It works fine if I remove the inner class, but my use case requires one.
Accepted answer by abaghel
Your JavaBean class should have a public no-argument constructor, getters and setters, and it should implement the Serializable interface. Spark SQL works on valid JavaBean classes.
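Spark's Encoders.bean relies on standard JavaBean introspection, so a quick way to sanity-check whether a class is bean-compliant is the JDK's own java.beans.Introspector. This is a minimal sketch of that idea, not Spark's exact mechanism; the BeanCheck wrapper class and its InnerClass are illustrative only:

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.io.Serializable;

public class BeanCheck {
    // A bean-compliant nested class: public, static, implicit public
    // no-arg constructor, and a matching getter/setter pair.
    public static class InnerClass implements Serializable {
        private String streetno;
        public String getStreetno() { return streetno; }
        public void setStreetno(String streetno) { this.streetno = streetno; }
    }

    public static void main(String[] args) throws Exception {
        // Introspect the class, stopping at Object to skip getClass().
        BeanInfo info = Introspector.getBeanInfo(InnerClass.class, Object.class);
        for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
            System.out.println(pd.getName()
                    + " readable=" + (pd.getReadMethod() != null)
                    + " writable=" + (pd.getWriteMethod() != null));
        }
        // prints: streetno readable=true writable=true
    }
}
```

If a property shows up as not readable or not writable here (for example, a missing setter, or a non-static inner class that cannot be instantiated without its enclosing instance), Encoders.bean will typically reject the class as not bean-compliant.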
EDIT: Adding a working sample with the inner class.
OuterInnerDF.java
package com.abaghel.examples;

import java.util.ArrayList;
import java.util.Collections;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;

import com.abaghel.examples.OuterClass.InnerClass;

public class OuterInnerDF {
    public static void main(String[] args) {
        SparkSession spark = SparkSession
                .builder()
                .appName("OuterInnerDF")
                .config("spark.sql.warehouse.dir", "/file:C:/temp")
                .master("local[2]")
                .getOrCreate();

        System.out.println("====> Create DataFrame");
        // Outer
        OuterClass us = new OuterClass();
        us.setId(111);
        // Inner
        OuterClass.InnerClass ic = new OuterClass.InnerClass();
        ic.setStreetno("My Street");
        // List
        ArrayList<InnerClass> ar = new ArrayList<InnerClass>();
        ar.add(ic);
        us.setListofInner(ar);
        // DF
        Encoder<OuterClass> outerClassEncoder = Encoders.bean(OuterClass.class);
        Dataset<OuterClass> ds = spark.createDataset(Collections.singletonList(us), outerClassEncoder);
        ds.show();
    }
}
OuterClass.java
package com.abaghel.examples;

import java.io.Serializable;
import java.util.ArrayList;

public class OuterClass implements Serializable {
    int id;
    ArrayList<InnerClass> listofInner;

    public int getId() {
        return id;
    }
    public void setId(int num) {
        this.id = num;
    }
    public ArrayList<InnerClass> getListofInner() {
        return listofInner;
    }
    public void setListofInner(ArrayList<InnerClass> list) {
        this.listofInner = list;
    }

    public static class InnerClass implements Serializable {
        String streetno;

        public void setStreetno(String streetno) {
            this.streetno = streetno;
        }
        public String getStreetno() {
            return streetno;
        }
    }
}
Console Output
====> Create DataFrame
16/08/28 18:02:55 INFO CodeGenerator: Code generated in 32.516369 ms
+---+-------------+
| id| listofInner|
+---+-------------+
|111|[[My Street]]|
+---+-------------+