I am trying to implement one of the use cases given in the book Hadoop in Action, but the code does not compile. I am new to Java, so I cannot understand the exact reason behind the errors.
Interestingly, another piece of code that uses the same classes and methods compiles successfully.
hadoop@hadoopnode1:~/hadoop-0.20.2/playground/src$ javac -classpath /home/hadoop/hadoop-0.20.2/hadoop-0.20.2-core.jar:/home/hadoop/hadoop-0.20.2/lib/commons-cli-1.2.jar:/home/hadoop/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar -d ../classes DataJoin2.java
DataJoin2.java:49: cannot find symbol
symbol : constructor TaggedWritable(org.apache.hadoop.io.Text)
location: class DataJoin2.TaggedWritable
TaggedWritable retv = new TaggedWritable((Text) value);
^
DataJoin2.java:69: cannot find symbol
symbol : constructor TaggedWritable(org.apache.hadoop.io.Text)
location: class DataJoin2.TaggedWritable
TaggedWritable retv = new TaggedWritable(new Text(joinedStr));
^
DataJoin2.java:113: setMapperClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper>) in org.apache.hadoop.mapreduce.Job cannot be applied to (java.lang.Class<DataJoin2.MapClass>)
job.setMapperClass(MapClass.class);
^
DataJoin2.java:114: setReducerClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Reducer>) in org.apache.hadoop.mapreduce.Job cannot be applied to (java.lang.Class<DataJoin2.Reduce>)
job.setReducerClass(Reduce.class);
^
4 errors
----------------Code----------------------
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapred.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.util.GenericOptionsParser;
// DataJoin Classes
import org.apache.hadoop.contrib.utils.join.DataJoinMapperBase;
import org.apache.hadoop.contrib.utils.join.TaggedMapOutput;
import org.apache.hadoop.contrib.utils.join.DataJoinReducerBase;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;
public class DataJoin2
{
    public static class MapClass extends DataJoinMapperBase
    {
        protected Text generateInputTag(String inputFile)
        {
            String datasource = inputFile.split("-")[0];
            return new Text(datasource);
        }

        protected Text generateGroupKey(TaggedMapOutput aRecord)
        {
            String line = ((Text) aRecord.getData()).toString();
            String[] tokens = line.split(",");
            String groupKey = tokens[0];
            return new Text(groupKey);
        }

        protected TaggedMapOutput generateTaggedMapOutput(Object value)
        {
            TaggedWritable retv = new TaggedWritable((Text) value);
            retv.setTag(this.inputTag);
            return retv;
        }
    } // End of class MapClass

    public static class Reduce extends DataJoinReducerBase
    {
        protected TaggedMapOutput combine(Object[] tags, Object[] values)
        {
            if (tags.length < 2) return null;
            String joinedStr = "";
            for (int i = 0; i < values.length; i++)
            {
                if (i > 0) joinedStr += ",";
                TaggedWritable tw = (TaggedWritable) values[i];
                String line = ((Text) tw.getData()).toString();
                String[] tokens = line.split(",", 2);
                joinedStr += tokens[1];
            }
            TaggedWritable retv = new TaggedWritable(new Text(joinedStr));
            retv.setTag((Text) tags[0]);
            return retv;
        }
    } // End of class Reduce

    public static class TaggedWritable extends TaggedMapOutput
    {
        private Writable data;

        public TaggedWritable()
        {
            this.tag = new Text("");
            this.data = data;
        }

        public Writable getData()
        {
            return data;
        }

        public void write(DataOutput out) throws IOException
        {
            this.tag.write(out);
            this.data.write(out);
        }

        public void readFields(DataInput in) throws IOException
        {
            this.tag.readFields(in);
            this.data.readFields(in);
        }
    } // End of class TaggedWritable

    public static void main(String[] args) throws Exception
    {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: DataJoin2 <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "DataJoin");
        job.setJarByClass(DataJoin2.class);
        job.setMapperClass(MapClass.class);
        job.setReducerClass(Reduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(TaggedWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
The error message leaves no ambiguity: it tells you that you have not provided a constructor for TaggedWritable that takes an argument of type Text. The code you posted only shows a no-argument constructor.
For the first two error messages, the compiler is telling you clearly that you do not have a TaggedWritable constructor that accepts an argument of type Text. It looks to me like you intend TaggedWritable to be a wrapper that adds a tag to a Writable, so I suggest adding a constructor like this:

public TaggedWritable(Writable data) {
    this.tag = new Text("");
    this.data = data;
}
In fact, as you have written it, the line

this.data = data;

simply reassigns data to itself, so I am fairly sure you meant to have a constructor parameter named data. See the reasoning above for why I think that parameter should be a Writable rather than a Text. Since Text implements Writable, this resolves your first two error messages.
However, you need to keep the default no-argument constructor as well, because Hadoop uses reflection to instantiate Writable values when it serializes them across the network between the map and reduce phases. I think you have made a bit of a mess of the default no-arg constructor; presumably you meant:

public TaggedWritable() {
    this.tag = new Text("");
}

The reason I call it a mess is that if you never assign a valid instance to the Writable wrapped in TaggedWritable.data, you will get a NullPointerException when this.data.readFields(in) is called inside TaggedWritable.readFields(DataInput). Since this is a general-purpose wrapper, you should probably make TaggedWritable a generic type and then use reflection to assign TaggedWritable.data in the default no-arg constructor.
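If it helps, here is a minimal sketch of one way to avoid that NullPointerException. It is my own variation, not from the book: instead of making the class generic, it writes the concrete class name of the wrapped Writable into the stream so that readFields() can recreate the instance reflectively before deserializing into it.

public static class TaggedWritable extends TaggedMapOutput
{
    private Writable data;

    public TaggedWritable()
    {
        this.tag = new Text("");
        // data stays null here; readFields() recreates it from the stream
    }

    public TaggedWritable(Writable data)
    {
        this.tag = new Text("");
        this.data = data;
    }

    public Writable getData()
    {
        return data;
    }

    public void write(DataOutput out) throws IOException
    {
        this.tag.write(out);
        // record the concrete type so the reader knows what to instantiate
        out.writeUTF(this.data.getClass().getName());
        this.data.write(out);
    }

    public void readFields(DataInput in) throws IOException
    {
        this.tag.readFields(in);
        String dataClass = in.readUTF();
        if (this.data == null) {
            try {
                // instantiate the wrapped Writable reflectively before reading it
                this.data = (Writable) Class.forName(dataClass).newInstance();
            } catch (Exception e) {
                throw new IOException("Cannot create instance of " + dataClass, e);
            }
        }
        this.data.readFields(in);
    }
} // End of class TaggedWritable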
For the last two compiler errors: to use hadoop-datajoin, I have noted that you need to use the old API classes. Therefore, all of these

org.apache.hadoop.mapreduce.Job
org.apache.hadoop.mapreduce.Mapper
org.apache.hadoop.mapreduce.Reducer
org.apache.hadoop.mapreduce.lib.input.FileInputFormat
org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
org.apache.hadoop.mapreduce.lib.input.TextInputFormat

should be replaced with their old-API equivalents, so org.apache.hadoop.mapred.JobConf instead of org.apache.hadoop.mapreduce.Job, and so on. That will take care of your last two error messages.
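As a rough sketch of what the driver could look like against the old mapred API (this is my own sketch, assuming you keep MapClass, Reduce, and TaggedWritable as defined above; it replaces the new-API code in main()):

// old mapred API imports replacing the org.apache.hadoop.mapreduce ones
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public static void main(String[] args) throws Exception
{
    JobConf job = new JobConf(DataJoin2.class); // old-API job configuration
    job.setJobName("DataJoin");

    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    // DataJoinMapperBase / DataJoinReducerBase implement the old mapred
    // Mapper / Reducer interfaces, so these calls now type-check
    job.setMapperClass(MapClass.class);
    job.setReducerClass(Reduce.class);

    job.setInputFormat(TextInputFormat.class);
    job.setOutputFormat(TextOutputFormat.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(TaggedWritable.class);

    JobClient.runJob(job);
}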
I have hadoop-2.7.1, and for me it worked after adding the following Maven dependency to pom.xml:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-datajoin</artifactId>
    <version>2.7.1</version>
</dependency>
Here is the URL for hadoop-datajoin: https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-datajoin