I am using Hadoop 2.2.0. hadoop-mapreduce-examples-2.2.0.jar runs fine on HDFS.
I wrote a WordCount program in Eclipse, added the required jars with Maven, and ran the resulting jar:
ubuntu@ubuntu-linux:~$ yarn jar Sample-0.0.1-SNAPSHOT.jar com.vij.Sample.WordCount /user/ubuntu/wordcount/input/vij.txt user/ubuntu/wordcount/output
It gives the following error:
15/02/17 13:09:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/02/17 13:09:10 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/02/17 13:09:11 ERROR security.UserGroupInformation: PriviledgedActionException as:ubuntu (auth:SIMPLE) cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:54310/user/ubuntu/wordcount/input/vij.txt already exists
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:54310/user/ubuntu/wordcount/input/vij.txt already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:456)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:342)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
at com.vij.Sample.WordCount.main(WordCount.java:33)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
The jar is on my local system, while the input and output paths are both on HDFS. The output directory does not already exist under the HDFS output path.
Please advise. Thanks.
The actual error is:
ERROR security.UserGroupInformation: PriviledgedActionException as:ubuntu (auth:SIMPLE) cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:54310/user/ubuntu/wordcount/input/vij.txt already exists
Delete the already existing output file "vij.txt", or write the job output to a different path.
Alternatively, try the following steps:
Download the WordCount source code from the link below and unzip it under $HADOOP_HOME:
$ cd $HADOOP_HOME
$ wget http://salsahpc.indiana.edu/tutorial/source_code/Hadoop-WordCount.zip
$ unzip Hadoop-WordCount.zip
Then upload the input file (any plain-text file) to the Hadoop Distributed File System (HDFS):
$ bin/hadoop fs -put $HADOOP_HOME/Hadoop-WordCount/input/ input
$ bin/hadoop fs -ls input
Here, $HADOOP_HOME/Hadoop-WordCount/input/ is the local directory where the program's input is stored. The second "input" refers to the destination directory on HDFS.
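If you prefer to stage the input from Java instead of the shell, the same upload can be done with the HDFS FileSystem API. This is only an illustrative sketch; the class name and the relative paths (mirroring the tutorial's local and HDFS directories) are assumptions:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative helper: programmatic equivalent of
// "bin/hadoop fs -put Hadoop-WordCount/input/ input" followed by "fs -ls input".
public class UploadInput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // picks up core-site.xml, so fs.defaultFS points at HDFS
        FileSystem fs = FileSystem.get(conf);

        Path localInput = new Path("Hadoop-WordCount/input");  // local source directory
        Path hdfsInput = new Path("input");                    // destination on HDFS, relative to /user/<name>

        fs.copyFromLocalFile(localInput, hdfsInput);           // copy the whole directory into HDFS
        for (FileStatus status : fs.listStatus(hdfsInput)) {
            System.out.println(status.getPath());              // same information as "fs -ls input"
        }
    }
}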
After the input has been uploaded to HDFS, run the WordCount program with the following command. It is assumed that you have already compiled the WordCount program.
$ bin/hadoop jar $HADOOP_HOME/Hadoop-WordCount/wordcount.jar WordCount input output
If Hadoop is running correctly, it will print progress messages similar to the following:
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated.
Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
11/11/02 18:34:46 INFO input.FileInputFormat: Total input paths to process : 1
11/11/02 18:34:46 INFO mapred.JobClient: Running job: job_201111021738_0001
11/11/02 18:34:47 INFO mapred.JobClient: map 0% reduce 0%
11/11/02 18:35:01 INFO mapred.JobClient: map 100% reduce 0%
11/11/02 18:35:13 INFO mapred.JobClient: map 100% reduce 100%
11/11/02 18:35:18 INFO mapred.JobClient: Job complete: job_201111021738_0001
11/11/02 18:35:18 INFO mapred.JobClient: Counters: 25
...