Java Spark: JavaRDD<Tuple2> to JavaPairRDD<>

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/27024169/

Date: 2020-11-02 11:05:32  Source: igfitidea

Tags: java, mapreduce, apache-spark

Asked by YuliaSh.

I have a JavaRDD<Tuple2<String, String>> and need to transform it into a JavaPairRDD<String, String>. Currently I do this with a map function that simply returns the input tuple as is. Is there a better way?

Answered by YuliaSh.

JavaPairRDD.fromJavaRDD(rdd) is one solution.
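
A minimal sketch of this approach, assuming Spark is on the classpath and can run in local mode. The class name, the toPairs helper, the app name, and the sample data are made up for illustration and are not part of the original answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class FromJavaRddSketch {
    // Hypothetical helper: converts an RDD of tuples to a pair RDD
    // through the static factory, without a hand-written map function.
    static <K, V> JavaPairRDD<K, V> toPairs(JavaRDD<Tuple2<K, V>> tuples) {
        return JavaPairRDD.fromJavaRDD(tuples);
    }

    public static void main(String[] args) {
        // Local single-threaded master, chosen only for the demo.
        SparkConf conf = new SparkConf()
                .setAppName("from-java-rdd-sketch")
                .setMaster("local[1]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            List<Tuple2<String, String>> data = Arrays.asList(
                    new Tuple2<>("a", "1"),
                    new Tuple2<>("b", "2"));
            JavaRDD<Tuple2<String, String>> rdd = sc.parallelize(data);

            JavaPairRDD<String, String> pairs = toPairs(rdd);

            // Pair-only operations such as collectAsMap are now available.
            Map<String, String> asMap = pairs.collectAsMap();
            System.out.println(asMap);
        }
    }
}
```

Once the data is a JavaPairRDD, key-based operations such as reduceByKey or join can be used on it directly.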

Answered by Michal ?izmazia

For the reverse conversion (JavaPairRDD back to a JavaRDD of Tuple2), this seems to work:

JavaRDD.fromRDD(JavaPairRDD.toRDD(rdd), rdd.classTag());

Answered by Rajeev Rathor

Try this to transform a JavaRDD into a JavaPairRDD. It works perfectly for me.

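import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.PairFunction;
import scala.Tuple2;

// Sensor and SensorData are the answerer's own classes (not shown here).
JavaRDD<Sensor> sensorRdd = lines.map(new SensorData()).cache();
// Key each Sensor by its numeric id to get a JavaPairRDD.
JavaPairRDD<Integer, Sensor> deviceRdd = sensorRdd.mapToPair(new PairFunction<Sensor, Integer, Sensor>() {
    @Override
    public Tuple2<Integer, Sensor> call(Sensor sensor) throws Exception {
        return new Tuple2<>(Integer.parseInt(sensor.getsId().trim()), sensor);
    }
});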

Answered by Damir Olejar

Try this example:

// mutateFunction is a method that builds an RDD of Tuple2 from the rdd_world RDD.
JavaRDD<Tuple2<Integer, String>> mutate = mutateFunction(rdd_world);
JavaPairRDD<Integer, String> pairs = JavaPairRDD.fromJavaRDD(mutate);

Answered by preeze

Alternatively, you can call mapToPair(..) on your instance of org.apache.spark.api.java.JavaRDD.
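
As a rough illustration of the mapToPair route, here is a small sketch that assumes Java 8 or later (for the lambda) and Spark running in local mode. The class name, the keyByLength helper, and the sample words are hypothetical and only serve the demo.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class MapToPairSketch {
    // Hypothetical helper: keys each word by its length.
    // mapToPair takes a PairFunction, written here as a lambda.
    static JavaPairRDD<Integer, String> keyByLength(JavaRDD<String> words) {
        return words.mapToPair(w -> new Tuple2<>(w.length(), w));
    }

    public static void main(String[] args) {
        // Local single-threaded master, chosen only for the demo.
        SparkConf conf = new SparkConf()
                .setAppName("map-to-pair-sketch")
                .setMaster("local[1]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> words = sc.parallelize(Arrays.asList("spark", "rdd"));
            List<Tuple2<Integer, String>> pairs = keyByLength(words).collect();
            System.out.println(pairs);
        }
    }
}
```

When the RDD already holds Tuple2 elements, as in the question, the same call with an identity lambda (t -> t) does the conversion; mapToPair is mainly useful when the key or value still has to be computed.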
