I'm having a problem running HBase MR jobs on my CDH 5.2 cluster.
For example, I added the HBase classpath to the Hadoop classpath:
vi /etc/hadoop/conf/hadoop-env.sh
and added the line:
export HADOOP_CLASSPATH="`/usr/lib/hbase/bin/hbase classpath`:$HADOOP_CLASSPATH"
When I run:
hadoop jar /usr/lib/hbase/hbase-server-0.98.6-cdh5.2.1.jar rowcounter "mytable"
I get the following exception:
14/12/09 03:44:02 WARN security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: hdfs://clusterName/usr/lib/hbase/lib/hbase-client-0.98.6-cdh5.2.1.jar
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.mapreduce.Driver.main(Driver.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://clusterName/usr/lib/hbase/lib/hbase-client-0.98.6-cdh5.2.1.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1083)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1075)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1075)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1292)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1313)
at org.apache.hadoop.hbase.mapreduce.RowCounter.main(RowCounter.java:191)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:153)
So the problem is environmental: when I add the following jars to /usr/lib/hadoop/lib, everything works:
hbase-client-0.98.6-cdh5.2.1.jar
hbase-common-0.98.6-cdh5.2.1.jar
hbase-protocol-0.98.6-cdh5.2.1.jar
hbase-server-0.98.6-cdh5.2.1.jar
hbase-prefix-tree-0.98.6-cdh5.2.1.jar
hadoop-core-2.5.0-mr1-cdh5.2.1.jar
htrace-core-2.04.jar
My machine has the following RPMs installed:
>> rpm -qa | grep cdh
zookeeper-3.4.5+cdh5.2.1+84-1.cdh5.2.1.p0.13.el6.x86_64
hadoop-2.5.0+cdh5.2.1+578-1.cdh5.2.1.p0.14.el6.x86_64
hadoop-0.20-mapreduce-2.5.0+cdh5.2.1+578-1.cdh5.2.1.p0.14.el6.x86_64
hbase-regionserver-0.98.6+cdh5.2.1+64-1.cdh5.2.1.p0.9.el6.x86_64
cloudera-cdh-5-0.x86_64
bigtop-utils-0.7.0+cdh5.2.1+0-1.cdh5.2.1.p0.13.el6.noarch
bigtop-jsvc-0.6.0+cdh5.2.1+578-1.cdh5.2.1.p0.13.el6.x86_64
parquet-1.5.0+cdh5.2.1+38-1.cdh5.2.1.p0.12.el6.noarch
hadoop-hdfs-2.5.0+cdh5.2.1+578-1.cdh5.2.1.p0.14.el6.x86_64
hadoop-mapreduce-2.5.0+cdh5.2.1+578-1.cdh5.2.1.p0.14.el6.x86_64
hadoop-0.20-mapreduce-tasktracker-2.5.0+cdh5.2.1+578-1.cdh5.2.1.p0.14.el6.x86_64
hbase-0.98.6+cdh5.2.1+64-1.cdh5.2.1.p0.9.el6.x86_64
avro-libs-1.7.6+cdh5.2.1+69-1.cdh5.2.1.p0.13.el6.noarch
parquet-format-2.1.0+cdh5.2.1+6-1.cdh5.2.1.p0.14.el6.noarch
hadoop-yarn-2.5.0+cdh5.2.1+578-1.cdh5.2.1.p0.14.el6.x86_64
hadoop-hdfs-datanode-2.5.0+cdh5.2.1+578-1.cdh5.2.1.p0.14.el6.x86_64
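I'd still like to know which RPM is missing.
I had the same problem with CDH 5.2.0 as well. As a workaround, I manually copied the jar files into HDFS, and the exception no longer occurred.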
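The trace shows the job submitter looking for the jar at hdfs://clusterName/usr/lib/hbase/lib/..., so the copy workaround presumably mirrors the local jars into HDFS under that same path. A minimal sketch, assuming the jars live in /usr/lib/hbase/lib as in the trace; mirror_hbase_jars and the HADOOP_BIN override are hypothetical names introduced here, not part of any CDH tooling:

```shell
# Sketch of the workaround: copy the local HBase jars into HDFS under the
# same path the job submitter reported as missing.
# Assumptions: HBASE_LIB matches the path in the stack trace; HADOOP_BIN is a
# hypothetical override so the function can be dry-run with a stub.
HBASE_LIB="${HBASE_LIB:-/usr/lib/hbase/lib}"
HADOOP_BIN="${HADOOP_BIN:-hadoop}"

mirror_hbase_jars() {
    # Create the target directory in HDFS (no error if it already exists).
    "$HADOOP_BIN" fs -mkdir -p "$HBASE_LIB" || return 1
    for jar in "$HBASE_LIB"/*.jar; do
        # Skip the unexpanded pattern when the directory holds no jars.
        [ -e "$jar" ] || continue
        # -f overwrites a copy left over from an earlier run.
        "$HADOOP_BIN" fs -put -f "$jar" "$HBASE_LIB/" || return 1
    done
}
```

Calling mirror_hbase_jars on the submitting node would do the copy. The jars have to be copied again after every HBase upgrade, which is one reason to prefer a classpath-based fix.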
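Instead of manually adding the files to HDFS, you can add the HBase library paths to your .bashrc file. Add HBase's lib folder to the classpath, and also add the HBase classpath to HADOOP_CLASSPATH.
Your .bashrc file should contain the following: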
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`${HBASE_HOME}/bin/hbase classpath`
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`${HBASE_HOME}/bin/hbase mapredcp`
export CLASSPATH=${HBASE_HOME}/lib/*
Note: the classpath should point to the lib folder of your HBase installation directory. Compile and run your Java code with the following commands:
javac Example.java
java -classpath "$CLASSPATH:." Example
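To confirm that the exports above resolve to paths that actually exist on the local machine, a small check like the following may help. It is only a sketch: check_classpath is a hypothetical helper, and it treats a wildcard entry such as ${HBASE_HOME}/lib/* as present when its directory exists.

```shell
# Report every entry of a colon-separated classpath that does not exist
# on the local filesystem. Returns 1 if anything is missing, 0 otherwise.
check_classpath() {
    rest="$1:"
    missing=0
    while [ -n "$rest" ]; do
        entry="${rest%%:*}"   # text before the first colon
        rest="${rest#*:}"     # everything after the first colon
        [ -n "$entry" ] || continue
        case "$entry" in
            # A trailing /* is a Java wildcard entry: check its directory.
            */\*) entry="${entry%/\*}" ;;
        esac
        if [ ! -e "$entry" ]; then
            echo "missing: $entry"
            missing=1
        fi
    done
    return "$missing"
}
```

After sourcing .bashrc, run check_classpath "$HADOOP_CLASSPATH". It prints nothing when every entry is present locally. This only checks the local filesystem; it says nothing about what exists in HDFS.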