Hadoop MapReduce job with HDFS input and HBase output

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse it, you must attribute the original authors (not the translator). Original question: http://stackoverflow.com/questions/4545579/

Tags: java, hadoop, mapreduce, hbase, hdfs

Asked by jmventar

I'm new to Hadoop. I have a MapReduce job which is supposed to get its input from HDFS and write the reducer's output to HBase. I haven't found any good examples.

Here's the code; the error running this example is: Type mismatch in map, expected ImmutableBytesWritable received IntWritable.

Mapper Class

public static class AddValueMapper extends Mapper<LongWritable,
    Text, ImmutableBytesWritable, IntWritable> {

  /* input  <key: line number, value: full line>
   * output <key: log key,     value: log value> */
  @Override
  public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String line = value.toString();
    int pos = line.indexOf("=");

    // Key part (everything before '=')
    String p1 = line.substring(0, pos).trim();
    byte[] outKey = Bytes.toBytes(p1);

    // Value part (everything after '=')
    String p2 = line.substring(pos + 1).trim();
    int outValue = Integer.parseInt(p2);

    context.write(new ImmutableBytesWritable(outKey), new IntWritable(outValue));
  }
}
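Side note: this error typically appears when the map output key/value classes declared in the driver don't match what map() actually emits. A minimal sketch of the driver lines that would match this mapper, assuming the rest of the job setup is already in place:

// Hypothetical driver lines; "job" must be the real Job object
job.setMapOutputKeyClass(ImmutableBytesWritable.class);
job.setMapOutputValueClass(IntWritable.class);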

Reducer Class

public static class AddValuesReducer extends TableReducer<
    ImmutableBytesWritable, IntWritable, ImmutableBytesWritable> {

  @Override
  public void reduce(ImmutableBytesWritable key, Iterable<IntWritable> values,
      Context context) throws IOException, InterruptedException {

    // Sum all values for this key
    long total = 0;
    for (IntWritable val : values) {
      total += val.get();
    }

    // Put to HBase: column family "data", qualifier "total"
    Put put = new Put(key.get());
    put.add(Bytes.toBytes("data"), Bytes.toBytes("total"), Bytes.toBytes(total));
    context.write(key, put);
  }
}
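For completeness, a sketch of the imports these two classes rely on (assuming the 0.9x-era Hadoop/HBase APIs used throughout this question):

import java.io.IOException;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;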

I had a similar job with HDFS only, and it works fine.

Edited 18-06-2013. The college project finished successfully two years ago. For the job configuration (driver part), see the accepted answer.

Accepted answer by saurabh shashank

Here is the code that will solve your problem:

Driver

Configuration conf = HBaseConfiguration.create();
Job job = new Job(conf, "JOB_NAME");
job.setJarByClass(yourclass.class);
job.setMapperClass(yourMapper.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(IntWritable.class);
FileInputFormat.setInputPaths(job, new Path(inputPath));
TableMapReduceUtil.initTableReducerJob(TABLE, yourReducer.class, job);
job.setReducerClass(yourReducer.class);
job.waitForCompletion(true);
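Note that TableMapReduceUtil.initTableReducerJob already sets TableOutputFormat as the output format, points it at TABLE, and registers the reducer class, so the setReducerClass call above is redundant (though harmless).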

Mapper & Reducer

class yourMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
  // @Override map()
}

class yourReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {
  // @Override reduce()
}
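A minimal sketch of what those overrides could look like with the types declared above (the "data"/"total" column family and qualifier come from the question, and the key=value parsing mirrors the original mapper; everything else is a placeholder):

class yourMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Parse "key=value" lines and emit <Text, IntWritable>
    String[] parts = value.toString().split("=", 2);
    context.write(new Text(parts[0].trim()),
        new IntWritable(Integer.parseInt(parts[1].trim())));
  }
}

class yourReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {
  @Override
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    // Sum the values and write one HBase row per key
    long total = 0;
    for (IntWritable val : values) {
      total += val.get();
    }
    byte[] row = Bytes.toBytes(key.toString());
    Put put = new Put(row);
    put.add(Bytes.toBytes("data"), Bytes.toBytes("total"), Bytes.toBytes(total));
    context.write(new ImmutableBytesWritable(row), put);
  }
}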


Answer by Prasad D

The best and fastest way to bulk load data into HBase is to use HFileOutputFormat and the completebulkload utility.

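In outline, that flow looks something like this (a rough sketch against the 0.9x-era HBase API; the driver, mapper, table, and path names are all placeholders):

Configuration conf = HBaseConfiguration.create();
Job job = new Job(conf, "bulk-load-prepare");
job.setJarByClass(YourDriver.class);
job.setMapperClass(YourPutMapper.class); // must emit <ImmutableBytesWritable, Put>
job.setMapOutputKeyClass(ImmutableBytesWritable.class);
job.setMapOutputValueClass(Put.class);
FileInputFormat.setInputPaths(job, new Path(inputPath));
FileOutputFormat.setOutputPath(job, new Path(hfilePath));

// Sorts and partitions the output so the HFiles line up with the table's regions
HTable table = new HTable(conf, "TABLE");
HFileOutputFormat.configureIncrementalLoad(job, table);
job.waitForCompletion(true);

The HFiles written to hfilePath are then moved into the table with the completebulkload tool (the LoadIncrementalHFiles class).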
You will find sample code here:

Hope this will be useful :)

Answer by David

Not sure why the HDFS version works: normally you have to set the input format for the job, and FileInputFormat is an abstract class. Perhaps you left some lines out, such as:

job.setInputFormatClass(TextInputFormat.class);

Answer by badri

public void map(LongWritable key, Text value, Context context)
    throws IOException, InterruptedException {

Change this to ImmutableBytesWritable, IntWritable.

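Presumably that means a map signature along these lines (a guess at what this answer intends):

public void map(ImmutableBytesWritable key, IntWritable value, Context context)
    throws IOException, InterruptedException {
  // ...
}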
I am not sure... hope it works.
