Reading from a txt file and writing to HBase



I am trying to read from a txt file and write to HBase.

Job Class
    Job job = new Job(conf, "HWriterJob");
    job.setJarByClass(HWriterJob.class);
    FileInputFormat.setInputPaths(job, new Path(otherArgs[0]));
    job.setMapperClass(TokenizerMapper.class);
    job.setOutputKeyClass(ImmutableBytesWritable.class);
    job.setOutputValueClass(Put.class);
    TableMapReduceUtil.initTableReducerJob(table, null, job);
Mapper Class
    @Override
    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer st = new StringTokenizer(line, "|");
        String result[] = new String[st.countTokens()];
        int i = 0;
        while (st.hasMoreTokens()) {
            result[i] = st.nextToken();
            i++;
        }
        Map<ImmutableBytesWritable,Put> resultSet = writeToHBase(result);
        for (Map.Entry<ImmutableBytesWritable,Put> entry : resultSet.entrySet()) {
            context.write(new Text(entry.getValue().getRow()), entry.getValue());
        }
    }
Reducer Class
    public void reduce(Text key, Iterable<Put> values, Context context)
            throws IOException, InterruptedException {
        for (Put val : values) {
            context.write(key, val);
        }
    }

But this does not work.

I get the following error: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text

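Apparently, MR defaults to LongWritable as the MapOutputKeyClass, whereas in your case it should be Text, hence the error.

Try setting job.setMapOutputKeyClass(Text.class), and also set job.setMapOutputValueClass appropriately (Put.class here, since your mapper emits Put values).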
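
To illustrate how this kind of ClassCastException can surface at runtime, here is a minimal, Hadoop-free sketch. It assumes the job reads its input with the default TextInputFormat, which hands the mapper LongWritable byte offsets as keys; a mapper declared with a Text input key, as in the question, would then fail at the cast. OffsetKey, TextKey, LineMapper and feed are hypothetical stand-ins invented for this example, not Hadoop classes.

```java
class KeyTypeMismatchDemo {
    // Hypothetical stand-in for Hadoop's LongWritable (not the real class).
    static final class OffsetKey {
        final long offset;
        OffsetKey(long offset) { this.offset = offset; }
    }

    // Hypothetical stand-in for Hadoop's Text (not the real class).
    static final class TextKey {
        final String text;
        TextKey(String text) { this.text = text; }
    }

    // Hypothetical mapper type, generic over its input key type.
    interface LineMapper<K> {
        String map(K key, String line);
    }

    // Declares a text key, as the mapper in the question does.
    static final class TextKeyMapper implements LineMapper<TextKey> {
        @Override
        public String map(TextKey key, String line) {
            return line;
        }
    }

    // Declares an offset key, matching what the simulated input actually supplies.
    static final class OffsetKeyMapper implements LineMapper<OffsetKey> {
        @Override
        public String map(OffsetKey key, String line) {
            return key.offset + ":" + line;
        }
    }

    // Simulated framework call: generics are erased, so any key object can be passed in.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static String feed(LineMapper mapper, Object key, String line) {
        return mapper.map(key, line);
    }

    // Returns true if feeding an offset key to the mapper throws ClassCastException.
    static boolean failsWithClassCast(LineMapper<?> mapper) {
        try {
            feed(mapper, new OffsetKey(0L), "a|b|c");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsWithClassCast(new TextKeyMapper()));   // true
        System.out.println(failsWithClassCast(new OffsetKeyMapper())); // false
    }
}
```

If that is what is happening in your job, the corresponding change would be to declare the mapper's input key as LongWritable, or to use an input format that produces Text keys, such as KeyValueTextInputFormat.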

@Job Class
    Job job = new Job(conf, "HWriterJob");
    job.setJarByClass(HWriterJob.class);
    FileInputFormat.setInputPaths(job, new Path(otherArgs[0]));
    job.setMapperClass(TokenizerMapper.class);
    TextInputFormat.setInputPaths(job, new Path(args[0]));
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputKeyClass(ImmutableBytesWritable.class);
    job.setOutputValueClass(Put.class);
    job.setOutputFormatClass(TableOutputFormat.class);
    // Map-only job: with zero reducers, the mapper output goes straight to the output format.
    job.setNumReduceTasks(0);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
@Mapper
    Map<ImmutableBytesWritable,Put> resultSet = writeToHBase(result);
    for (Map.Entry<ImmutableBytesWritable,Put> entry : resultSet.entrySet()) {
        // Emit the row key as ImmutableBytesWritable, matching the job's output key class.
        context.write(entry.getKey(), entry.getValue());
    }
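
One detail worth checking in the mapper from the question before the fields reach writeToHBase: StringTokenizer skips empty tokens, so a line with an empty field yields fewer entries than it has columns. The sketch below is plain Java with no Hadoop dependency; PipeSplitDemo, withTokenizer and withSplit are hypothetical names used only for this illustration, and whether empty fields occur in your data is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class PipeSplitDemo {
    // Tokenizes the way the question's mapper does; consecutive delimiters are collapsed.
    static List<String> withTokenizer(String line) {
        StringTokenizer st = new StringTokenizer(line, "|");
        List<String> fields = new ArrayList<>();
        while (st.hasMoreTokens()) {
            fields.add(st.nextToken());
        }
        return fields;
    }

    // Splits on a literal '|'; the -1 limit keeps empty and trailing fields.
    static List<String> withSplit(String line) {
        return Arrays.asList(line.split("\\|", -1));
    }

    public static void main(String[] args) {
        String line = "row1||value3";
        System.out.println(withTokenizer(line)); // [row1, value3]
        System.out.println(withSplit(line));     // [row1, , value3]
    }
}
```

If column positions matter when building each Put, the split variant keeps the field indexes aligned with the columns in the file.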
