My MapReduce job runs fine when it is assembled in Eclipse, with every possible Hadoop and Hive jar included in the Eclipse project as a dependency. (These are the jars that ship with the single-node, local Hadoop installation.)
However, when I try to run the same program assembled with the Maven project shown below, I get:
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
This exception occurs when the program is assembled with the following Maven project:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.bigdata.hadoop</groupId>
    <artifactId>FieldCounts</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>FieldCounts</name>
    <url>http://maven.apache.org</url>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>2.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
            <version>2.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hive.hcatalog</groupId>
            <artifactId>hcatalog-core</artifactId>
            <version>0.12.0</version>
        </dependency>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>16.0.1</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.3.2</version>
                <configuration>
                    <source>${jdk.version}</source>
                    <target>${jdk.version}</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <executions>
                    <execution>
                        <goals>
                            <goal>attached</goal>
                        </goals>
                        <phase>package</phase>
                        <configuration>
                            <descriptorRefs>
                                <descriptorRef>jar-with-dependencies</descriptorRef>
                            </descriptorRefs>
                            <archive>
                                <manifest>
                                    <mainClass>com.bigdata.hadoop.FieldCounts</mainClass>
                                </manifest>
                            </archive>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
*Please advise where and how I can find compatible Hadoop jars.*
[update_1] I am running Hadoop 2.2.0.2.0.6.0-101
As I found out here: https://github.com/kevinweil/elephant-bird/issues/247
Hadoop 1.0.3: JobContext是一个类
Hadoop 2.0.0: JobContext是一个接口
My pom.xml contains three jars with version 2.2.0:
hadoop-hdfs 2.2.0
hadoop-common 2.2.0
hadoop-mapreduce-client-jobclient 2.2.0
hcatalog-core 0.12.0
The only exception is hcatalog-core, whose version is 0.12.0. I cannot find a more recent version of this jar, and I need it!
How can I find which of these four jars produces java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected?
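One way to narrow this down is to print, at run time, which jar the conflicting class is actually loaded from. A minimal sketch (the class name WhichJar is only for illustration; run it with the same classpath the failing job uses):

import org.apache.hadoop.mapreduce.JobContext;

public class WhichJar {
    public static void main(String[] args) {
        // Prints the code source (the jar) the JVM loaded JobContext from.
        System.out.println(
                JobContext.class.getProtectionDomain().getCodeSource().getLocation());
    }
}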
Please give me an idea of how to solve this problem. (The only solution I can see is to compile everything from source!)
[/update_1]
The full source of my MapReduce job:
package com.bigdata.hadoop;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.util.*;
import org.apache.hcatalog.mapreduce.*;
import org.apache.hcatalog.data.*;
import org.apache.hcatalog.data.schema.*;
import org.apache.log4j.Logger;

public class FieldCounts extends Configured implements Tool {

    public static class Map extends Mapper<WritableComparable, HCatRecord, TableFieldValueKey, IntWritable> {

        static Logger logger = Logger.getLogger("com.foo.Bar");
        static boolean firstMapRun = true;
        static List<String> fieldNameList = new LinkedList<String>();

        /**
         * Return a list of field names not containing `id` field name
         * @param schema
         * @return
         */
        static List<String> getFieldNames(HCatSchema schema) {
            // Filter out `id` name just once
            if (firstMapRun) {
                firstMapRun = false;
                List<String> fieldNames = schema.getFieldNames();
                for (String fieldName : fieldNames) {
                    if (!fieldName.equals("id")) {
                        fieldNameList.add(fieldName);
                    }
                }
            } // if (firstMapRun)
            return fieldNameList;
        }

        @Override
        protected void map( WritableComparable key,
                            HCatRecord hcatRecord,
                            //org.apache.hadoop.mapreduce.Mapper
                            //<WritableComparable, HCatRecord, Text, IntWritable>.Context context)
                            Context context)
            throws IOException, InterruptedException {

            HCatSchema schema = HCatBaseInputFormat.getTableSchema(context.getConfiguration());

            //String schemaTypeStr = schema.getSchemaAsTypeString();
            //logger.info("******** schemaTypeStr ********** : "+schemaTypeStr);

            //List<String> fieldNames = schema.getFieldNames();
            List<String> fieldNames = getFieldNames(schema);
            for (String fieldName : fieldNames) {
                Object value = hcatRecord.get(fieldName, schema);
                String fieldValue = null;
                if (null == value) {
                    fieldValue = "<NULL>";
                } else {
                    fieldValue = value.toString();
                }
                //String fieldNameValue = fieldName+"."+fieldValue;
                //context.write(new Text(fieldNameValue), new IntWritable(1));
                TableFieldValueKey fieldKey = new TableFieldValueKey();
                fieldKey.fieldName = fieldName;
                fieldKey.fieldValue = fieldValue;
                context.write(fieldKey, new IntWritable(1));
            }
        }
    }
    public static class Reduce extends Reducer<TableFieldValueKey, IntWritable,
                                               WritableComparable, HCatRecord> {

        protected void reduce( TableFieldValueKey key,
                               java.lang.Iterable<IntWritable> values,
                               Context context)
                               //org.apache.hadoop.mapreduce.Reducer<Text, IntWritable,
                               //WritableComparable, HCatRecord>.Context context)
            throws IOException, InterruptedException {

            Iterator<IntWritable> iter = values.iterator();
            int sum = 0;
            // Sum up occurrences of the given key
            while (iter.hasNext()) {
                IntWritable iw = iter.next();
                sum = sum + iw.get();
            }

            HCatRecord record = new DefaultHCatRecord(3);
            record.set(0, key.fieldName);
            record.set(1, key.fieldValue);
            record.set(2, sum);

            context.write(null, record);
        }
    }
    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        args = new GenericOptionsParser(conf, args).getRemainingArgs();

        // To fix Hadoop "META-INFO" (http://stackoverflow.com/questions/17265002/hadoop-no-filesystem-for-scheme-file)
        conf.set("fs.hdfs.impl",
                org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
        conf.set("fs.file.impl",
                org.apache.hadoop.fs.LocalFileSystem.class.getName());

        // Get the input and output table names as arguments
        String inputTableName = args[0];
        String outputTableName = args[1];
        // Assume the default database
        String dbName = null;

        Job job = new Job(conf, "FieldCounts");

        HCatInputFormat.setInput(job,
                InputJobInfo.create(dbName, inputTableName, null));
        job.setJarByClass(FieldCounts.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);

        // An HCatalog record as input
        job.setInputFormatClass(HCatInputFormat.class);

        // Mapper emits TableFieldValueKey as key and an integer as value
        job.setMapOutputKeyClass(TableFieldValueKey.class);
        job.setMapOutputValueClass(IntWritable.class);

        // Ignore the key for the reducer output; emitting an HCatalog record as
        // value
        job.setOutputKeyClass(WritableComparable.class);
        job.setOutputValueClass(DefaultHCatRecord.class);
        job.setOutputFormatClass(HCatOutputFormat.class);

        HCatOutputFormat.setOutput(job,
                OutputJobInfo.create(dbName, outputTableName, null));
        HCatSchema s = HCatOutputFormat.getTableSchema(job);
        System.err.println("INFO: output schema explicitly set for writing:"
                + s);
        HCatOutputFormat.setSchema(job, s);

        return (job.waitForCompletion(true) ? 0 : 1);
    }

    public static void main(String[] args) throws Exception {
        String classpath = System.getProperty("java.class.path");
        //System.out.println("*** CLASSPATH: "+classpath);
        int exitCode = ToolRunner.run(new FieldCounts(), args);
        System.exit(exitCode);
    }
}
And the class for the composite key:
package com.bigdata.hadoop;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

import com.google.common.collect.ComparisonChain;

public class TableFieldValueKey implements WritableComparable<TableFieldValueKey> {

    public String fieldName;
    public String fieldValue;

    public TableFieldValueKey() {} // must have a default constructor

    public void readFields(DataInput in) throws IOException {
        fieldName = in.readUTF();
        fieldValue = in.readUTF();
    }

    public void write(DataOutput out) throws IOException {
        out.writeUTF(fieldName);
        out.writeUTF(fieldValue);
    }

    public int compareTo(TableFieldValueKey o) {
        return ComparisonChain.start().compare(fieldName, o.fieldName)
                .compare(fieldValue, o.fieldValue).result();
    }
}
Hadoop went through a massive code refactoring from Hadoop 1.0 to Hadoop 2.0. One side effect is that code compiled against Hadoop 1.0 is not compatible with Hadoop 2.0 and vice versa. However, the source code is mostly compatible, so code just needs to be recompiled against the target Hadoop distribution.
异常" Found interface X, but class was expected
"是非常常见的,当你运行在Hadoop 2.0上为Hadoop 1.0编译的代码,反之亦然。
Find the exact Hadoop version used on the cluster, then specify that version in your pom.xml.
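For example, the cluster's exact version can be printed with the command "hadoop version" and then pinned in a single place in the pom. A minimal sketch (the hadoop.version property name is only a convention; it is not defined in the original pom):

<properties>
    <hadoop.version>2.2.0</hadoop.version>
</properties>
<!-- same ${hadoop.version} for hadoop-hdfs and hadoop-mapreduce-client-jobclient -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>${hadoop.version}</version>
</dependency>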
You need to recompile "hcatalog-core" so that it supports Hadoop 2.0.0. Currently "hcatalog-core" only supports Hadoop 1.0.
Obviously, you have a version incompatibility between your Hadoop and Hive versions. You need to upgrade (or downgrade) either your Hadoop version or your Hive version.
This is due to the incompatibility between Hadoop 1 and Hadoop 2.
Look for entries such as
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>1.2.1</version>
</dependency>
in your pom.xml. These define which Hadoop version is used; change or remove them according to your requirements.
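If no such entry is declared directly, the Hadoop 1 artifact may be arriving transitively, for example through hcatalog-core; running "mvn dependency:tree -Dincludes=org.apache.hadoop" shows where each Hadoop artifact enters the build. In that case an exclusion along these lines can keep the Hadoop 1 jar off the classpath (a sketch only; the excluded artifactId must match what dependency:tree actually reports, and this has not been verified against the published hcatalog-core 0.12.0 pom):

<dependency>
    <groupId>org.apache.hive.hcatalog</groupId>
    <artifactId>hcatalog-core</artifactId>
    <version>0.12.0</version>
    <exclusions>
        <!-- hypothetical: exclude a transitively pulled-in Hadoop 1 artifact -->
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
        </exclusion>
    </exclusions>
</dependency>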
I ran into this problem as well. I was trying to use HCatMultipleInputs with hive-hcatalog-core-0.13.0.jar, and we were running Hadoop 2.5.1.
The following code change helped me fix the problem:
//JobContext ctx = new JobContext(conf, jobContext.getJobID());
JobContext ctx = new Job(conf);
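For context, in Hadoop 2 org.apache.hadoop.mapreduce.Job implements the JobContext interface, so a Job instance can be used wherever a JobContext is needed. A minimal sketch of the same idea (class and variable names are only for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobContext;

public class JobContextExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hadoop 1: JobContext was a concrete class and could be instantiated directly,
        // e.g. new JobContext(conf, jobId). Under Hadoop 2 that no longer compiles.
        // Hadoop 2: JobContext is an interface; Job implements it.
        Job job = Job.getInstance(conf);   // new Job(conf) also works but is deprecated
        JobContext ctx = job;
        System.out.println(ctx.getClass().getName());
    }
}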