Class Not Found Exception in MapReduce wordcount job
Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/21373550/
Asked by lucifer
I am trying to run a wordcount job in Hadoop, but I always get a class-not-found exception. I am posting the class that I wrote and the command I am using to run the job.
import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "WordCount");
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
        job.setJarByClass(WordCount.class);
    }
}
The wordcount.jar is exported to my Downloads folder, and this is the command I use to run the job:
jeet@jeet-Vostro-2520:~/Downloads$ hadoop jar wordcount.jar org.gamma.WordCount /user/jeet/getty/gettysburg.txt /user/jeet/getty/out
In this case my MapReduce job starts, but it ends in the middle of the process. Here is the exception trace:
14/01/27 13:16:02 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/01/27 13:16:02 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
14/01/27 13:16:02 INFO input.FileInputFormat: Total input paths to process : 1
14/01/27 13:16:02 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/01/27 13:16:02 WARN snappy.LoadSnappy: Snappy native library not loaded
14/01/27 13:16:03 INFO mapred.JobClient: Running job: job_201401271247_0001
14/01/27 13:16:04 INFO mapred.JobClient: map 0% reduce 0%
14/01/27 13:16:11 INFO mapred.JobClient: Task Id : attempt_201401271247_0001_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.gamma.WordCount$Map
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:849)
at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:719)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.gamma.WordCount$Map
at java.net.URLClassLoader.run(URLClassLoader.java:366)
at java.net.URLClassLoader.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:802)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:847)
... 8 more
14/01/27 13:16:16 INFO mapred.JobClient: Task Id : attempt_201401271247_0001_m_000000_1, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.gamma.WordCount$Map
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:849)
at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:719)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.gamma.WordCount$Map
at java.net.URLClassLoader.run(URLClassLoader.java:366)
at java.net.URLClassLoader.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:802)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:847)
... 8 more
14/01/27 13:16:20 INFO mapred.JobClient: Task Id : attempt_201401271247_0001_m_000000_2, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.gamma.WordCount$Map
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:849)
at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:719)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.gamma.WordCount$Map
at java.net.URLClassLoader.run(URLClassLoader.java:366)
at java.net.URLClassLoader.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:802)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:847)
... 8 more
14/01/27 13:16:26 INFO mapred.JobClient: Job complete: job_201401271247_0001
14/01/27 13:16:26 INFO mapred.JobClient: Counters: 7
14/01/27 13:16:26 INFO mapred.JobClient: Job Counters
14/01/27 13:16:26 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=20953
14/01/27 13:16:26 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/01/27 13:16:26 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/01/27 13:16:26 INFO mapred.JobClient: Launched map tasks=4
14/01/27 13:16:26 INFO mapred.JobClient: Data-local map tasks=4
14/01/27 13:16:26 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
14/01/27 13:16:26 INFO mapred.JobClient: Failed map tasks=1
Could somebody please help? I think I am very close.
Answered by Thamme Gowda
I suspect this warning:
14/01/27 13:16:02 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
I got the same error when using CDH4.6, and it went away once I resolved the above warning.
Answered by KT_admin
Try adding this:
Job job = new Job(conf, "wordcount");
job.setJarByClass(WordCount.class);
Answered by Kalu
You have to add this method call
job.setJarByClass(WordCount.class);
before invoking the method
job.waitForCompletion(true);
like the following:
job.setJarByClass(WordCount.class);
job.waitForCompletion(true);
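The ordering described above can be sketched as a complete driver. This is only a sketch under assumptions: the package line org.gamma is inferred from the class name in the stack trace (the posted code omits it), the Hadoop libraries are assumed to be on the compile classpath, and wrapping the driver in Tool/ToolRunner is an addition that the first warning in the log suggests, not something the asker wrote. The key point is that every setter, including setJarByClass, runs before waitForCompletion.

```java
package org.gamma; // assumed from "org.gamma.WordCount$Map" in the stack trace

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WordCount extends Configured implements Tool {

    // Emits (word, 1) for every whitespace-separated token of each input line.
    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokenizer = new StringTokenizer(value.toString());
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Sums the counts collected for each word.
    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    @Override
    public int run(String[] args) throws Exception {
        // new Job(conf, name) matches the original post; it is deprecated in later Hadoop releases.
        Job job = new Job(getConf(), "WordCount");
        // Tell Hadoop which jar holds the user classes BEFORE the job is submitted.
        job.setJarByClass(WordCount.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Submit last and turn the job result into a process exit code.
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new WordCount(), args));
    }
}
```

Once packaged, it would be run with the same command as in the question: hadoop jar wordcount.jar org.gamma.WordCount followed by the input and output paths.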
Answered by spiralmoon
Try job.setJar("wordcount.jar");, where wordcount.jar is the jar file that you are going to package to.
This method works for me, but setJarByClass does NOT!
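Whichever call is used to name the jar, the tasks can only load org.gamma.WordCount$Map if that class file is really inside the jar. A minimal sketch for checking this with the standard java.util.jar API follows; the class name JarClassCheck and its helper methods are hypothetical names chosen for illustration, and the default arguments are taken from the question.

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.JarFile;

public class JarClassCheck {

    /** Converts a binary class name such as org.gamma.WordCount$Map to its jar entry path. */
    public static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath contains an entry for the given class. */
    public static boolean jarContainsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryNameFor(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults mirror the jar and class named in the question.
        String jarPath = args.length > 0 ? args[0] : "wordcount.jar";
        String className = args.length > 1 ? args[1] : "org.gamma.WordCount$Map";
        if (!new File(jarPath).isFile()) {
            System.out.println("Jar not found: " + jarPath);
            return;
        }
        System.out.println(className + " present in " + jarPath + ": "
                + jarContainsClass(jarPath, className));
    }
}
```

If the nested class is missing, or sits under a different package path than the one passed on the hadoop jar command line, the export step is the likely culprit rather than the driver code.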
Answered by Kumar Basapuram
Use the code below to resolve this problem: job.setJarByClass(DriverClass.class);
Answered by Nagaraj Vittal
I also had the same issue and fixed it by removing the WordCount.class file that was sitting in the directory from which I was executing my jar. It looks like the class was being picked up from outside the jar. Try that.
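One way to see where a class is actually being picked up from is to ask the JVM for its code source. The sketch below uses only standard JDK calls; ClassOrigin and originOf are hypothetical names, and in the scenario above it would be invoked with the driver class name on the same classpath the job uses.

```java
import java.security.CodeSource;

public class ClassOrigin {

    /** Describes where the JVM loaded cls from: a jar, a directory, or the JDK itself. */
    public static String originOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Classes loaded by the bootstrap loader report no code source.
            return "no code source (typically a JDK class)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // With no argument, report the origin of this class itself.
        String name = args.length > 0 ? args[0] : ClassOrigin.class.getName();
        System.out.println(name + " loaded from " + originOf(Class.forName(name)));
    }
}
```

If the printed location is a directory rather than the jar, a stray .class file on the classpath is probably shadowing the packaged one.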
Answered by Victor
job.setJarByClass(WordCount.class); job.waitForCompletion(true);
Answered by ??V??? Rā????
Although a MapReduce program processes data in parallel, the Mapper, Combiner, and Reducer classes run in sequence. Each stage has to wait for the stage it depends on to complete, so the driver needs job.waitForCompletion(true);
However, the input and output paths must be set before the Mapper, Combiner, and Reducer classes start.
Change your code like this:
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "WordCount");
    // Set the input/output paths and the job jar first.
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.setJarByClass(WordCount.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    // Submit last: the job must be fully configured before waitForCompletion() is called.
    job.waitForCompletion(true);
}
I hope this works.
Answered by vickyi
I got this to work using JobConf#setJar(String).