我当前正在开发一个flink 1.4应用程序,该应用程序从Hadoop群集中读取AVRO文件。但是,在我的IDE上以本地模式运行它非常好。但是,当我将其提交给Jobmanager Flink时,它总是会失败以下消息:
java.io.IOException: Error opening the Input Split hdfs://namenode/topics/CaseLocations/partition=0/CaseLocations+0+0000155791+0000255790.avro [0,16549587]: Could not find a file system implementation for scheme 'hdfs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
at org.apache.flink.api.common.io.FileInputFormat.open(FileInputFormat.java:705)
at org.apache.flink.formats.avro.AvroInputFormat.open(AvroInputFormat.java:110)
at org.apache.flink.formats.avro.AvroInputFormat.open(AvroInputFormat.java:54)
at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:145)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'hdfs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:405)
at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:320)
at org.apache.flink.api.common.io.FileInputFormat$InputSplitOpenThread.run(FileInputFormat.java:864)
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop File System abstraction does not support scheme 'hdfs'. Either no file system implementation exists for that scheme, or the relevant classes are missing from the classpath.
at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:102)
at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:401)
... 2 more
Caused by: java.io.IOException: No FileSystem for scheme: hdfs
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2798)
at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:99)
... 3 more
我正在使用官方的Flink Docker Image flink:1.4.0-hadoop28-scala_2.11
运行集群,该图像应该已经包含Hadoop发行。
我还试图将依赖项添加到我的应用程序罐中,但这也无济于事。这是我的SBT依赖性:
val flinkVersion = "1.4.0"
val hadoopVersion = "2.8.1"
val providedDependencies = Seq(
"org.apache.flink" %% "flink-clients" % flinkVersion,
"org.apache.flink" %% "flink-scala" % flinkVersion,
"org.apache.flink" %% "flink-streaming-scala" % flinkVersion
)
val compiledDependencies = Seq(
"org.apache.flink" % "flink-hadoop-fs" % flinkVersion,
"org.apache.hadoop" % "hadoop-hdfs" % hadoopVersion,
"org.apache.hadoop" % "hadoop-common" % hadoopVersion,
"org.apache.flink" % "flink-avro" % flinkVersion,
"org.apache.flink" %% "flink-table" % flinkVersion,
"org.scalaj" %% "scalaj-http" % "2.2.1"
)
另外,文件系统类也包括在我的META-INF/services/org.apache.hadoop.fs.FileSystem
中。
我想念什么吗?官方文件无法帮助我。
预先感谢
首先,您需要HDFS的群集。
第二,您需要检查Flink_home/lib。
如果您打算将Apache Flink与Apache Hadoop一起使用(在纱线上运行flink,连接到HDFS,连接到HBase或使用一些基于Hadoop的文件系统连接器(,然后选择捆绑匹配的Hadoop版本的下载,下载,下载与您的版本匹配并将其放置在Flink的Lib文件夹中或导出您的Hadoop_classpath的可选预捆式Hadoop。
我今天得到了相同的启示,并通过两个步骤
修复了它- 检查hadoop_conf_dir(或hadoop_home,hadoop_classpath(配置正确
- 检查Flink_home/lib具有flink Shaded-Hadoop-2-uber-xxx.jar,如果不是,请从此处下载
如果两个步骤不正常,则可能需要重新启动Flink Cluster:(