Accessing HDFS from Java throws an error



I can access HDFS from the terminal via hdfs dfs -ls /, and I can get the cluster's address and port via hdfs getconf -confKey fs.defaultFS (I use that address and port in the code below).

Trying to read a file on HDFS from Java gives me an error similar to the one described here (also discussed in this question). Using the address, I try the following in Java:

// Excerpted from a JUnit test; LOG and fail() come from the test class.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

FileSystem fs;
String line;
Path path = new Path("hdfs://<address>:<port>/somedata.txt");
try
{
    /* --------------------------
     * Option 1: Gave 'Wrong FS: hdfs://..., Expected file:///' error
    Configuration configuration = new Configuration();
    configuration.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
    configuration.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
    fs = FileSystem.get(configuration);
     * ---------------------------
     */

    // --------------------------
    // Option 2: Gives the error stated below
    Configuration configuration = new Configuration();
    fs = FileSystem.get(new URI("hdfs://<address>:<port>"), configuration);
    // --------------------------

    LOG.info(fs.getConf().toString());
    FSDataInputStream fsDataInputStream = fs.open(path);
    InputStreamReader inputStreamReader = new InputStreamReader(fsDataInputStream);
    BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
    while ((line = bufferedReader.readLine()) != null) {
        // some file processing code here.
    }
    bufferedReader.close();
}
catch (Exception e)
{
    fail();
}

The error that Option 2 gives me is:

java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(Ljava/lang/String;)Ljava/net/InetSocketAddress;
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:99)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
at fwt.gateway.Test_Runner.checkLocationMasterindicesOnHDFS(Test_Runner.java:76)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:459)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:678)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:382)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)

The fact that I can access the files from the terminal suggests to me that core-site.xml and hdfs-site.xml must be correct.
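A quick way to check whether the Java side actually picks those files up is to print the resolved fs.defaultFS before opening anything; if it comes back as file:///, the XML resources were not loaded and FileSystem.get(configuration) falls back to the local filesystem (the 'Wrong FS' error from Option 1). A minimal sketch, assuming the /etc/hadoop/conf paths from the question:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class ConfCheck {
    public static void main(String[] args) {
        Configuration configuration = new Configuration();
        // Same resources as Option 1 in the question.
        configuration.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        configuration.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));

        // If this prints file:/// instead of hdfs://<address>:<port>,
        // the XML files were not read and FileSystem.get(configuration)
        // returns the local filesystem rather than HDFS.
        System.out.println(configuration.get("fs.defaultFS"));
    }
}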

Thanks for your help!

Edit 1: The Maven dependencies I use for the code above are as follows:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>3.0.0-alpha4</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>1.2.1</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>3.0.0-alpha4</version>
</dependency>

Update your POM to the following. A NoSuchMethodError like this usually means two different Hadoop versions are mixed on the classpath (here, hadoop-core 1.2.1 alongside the 3.0.0-alpha4 artifacts), so keep every Hadoop artifact at a consistent version:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.8.1</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>2.6.0-mr1-cdh5.4.2.1</version>
    <type>pom</type>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.8.1</version>
</dependency>

Never use alpha versions, as they may contain bugs.
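If the error persists after fixing the POM, it can help to confirm which jar the class from the stack trace is actually loaded from; running mvn dependency:tree likewise shows where the old hadoop-core artifact enters the build. A minimal sketch (the class name is taken from the stack trace above):

import org.apache.hadoop.hdfs.server.namenode.NameNode;

public class JarCheck {
    public static void main(String[] args) {
        // Prints the jar that provides the NameNode class from the
        // stack trace; if it points at a 1.2.1 or alpha jar, the old
        // dependency is still on the classpath.
        System.out.println(
            NameNode.class.getProtectionDomain().getCodeSource().getLocation());
    }
}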

Alternatively, you can use this in your pom.xml file:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.6.0</version>
</dependency>

I used version 2.6.0. You can try any newer version.
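With the dependency versions aligned, the read itself can also be written more defensively. A minimal sketch of the question's Option 2, assuming the same placeholder address and file, using try-with-resources so the streams are closed even when processing throws:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsRead {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        // Placeholder address/port as in the question.
        FileSystem fs = FileSystem.get(new URI("hdfs://<address>:<port>"), configuration);
        Path path = new Path("hdfs://<address>:<port>/somedata.txt");

        // try-with-resources closes the reader (and the underlying
        // FSDataInputStream) automatically.
        try (BufferedReader bufferedReader =
                 new BufferedReader(new InputStreamReader(fs.open(path)))) {
            String line;
            while ((line = bufferedReader.readLine()) != null) {
                // some file processing code here.
            }
        }
    }
}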
