I'm trying to consume from Kafka with Flink and save the results to HDFS, but no files are ever produced.
Incidentally, writing to a local file works fine, but as soon as I change the path to HDFS, nothing shows up at all.
import java.util.Properties

import org.apache.flink.api.common.serialization.{SimpleStringEncoder, SimpleStringSchema}
import org.apache.flink.core.fs.Path
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.{CheckpointingMode, TimeCharacteristic}
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink
import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.{DefaultRollingPolicy, RollingPolicy}
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010

object kafka2Hdfs {
  private val ZOOKEEPER_HOST = "ip1:2181,ip2:2181,ip3:2181"
  private val KAFKA_BROKER = "ip1:9092,ip2:9092,ip3:9092"
  private val TRANSACTION_GROUP = "transaction"
  val topic = "tgt3"

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
    // checkpoint every second, exactly-once
    env.enableCheckpointing(1000L)
    env.getCheckpointConfig.setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE)

    // configure Kafka consumer
    val kafkaProps = new Properties()
    .... // topic infos
    kafkaProps.setProperty("fs.default-scheme", "hdfs://ip:8020")
    val consumer = new FlinkKafkaConsumer010[String](topic, new SimpleStringSchema(), kafkaProps)
    val source = env.addSource(consumer)

    val path = new Path("/user/jay/data")

    // sink: roll over to a new part file every 15 seconds
    val rollingPolicy: RollingPolicy[String, String] = DefaultRollingPolicy.create()
      .withRolloverInterval(15000)
      .build()
    val sink: StreamingFileSink[String] = StreamingFileSink
      .forRowFormat(path, new SimpleStringEncoder[String]("UTF-8"))
      .withRollingPolicy(rollingPolicy)
      .build()

    source.addSink(sink)
    env.execute("test")
  }
}
I'm very confused..
Off the top of my head, two things to look at:
- Is the HDFS namenode properly configured, so that Flink knows it is writing to HDFS rather than to the local disk? (See the sketch after this list.)
- What do the NodeManager and TaskManager logs say? The sink may be failing due to a permission problem on HDFS.
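One quick way to test the first point is to fully qualify the sink path with the hdfs:// scheme, so Flink cannot silently fall back to the local filesystem. A minimal sketch (the namenode host and port here are placeholders, substitute your own):

// hypothetical namenode address; a fully qualified URI makes the target
// filesystem explicit instead of relying on the configured default scheme
val path = new Path("hdfs://namenode-host:8020/user/jay/data")

val sink: StreamingFileSink[String] = StreamingFileSink
  .forRowFormat(path, new SimpleStringEncoder[String]("UTF-8"))
  .withRollingPolicy(rollingPolicy)
  .build()

Also worth double-checking: fs.default-scheme is a Flink configuration option that belongs in flink-conf.yaml; setting it on the Kafka Properties object, as the code in the question does, has no effect on where the sink writes.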