Remote connection to [null] failed with java.net.NoRouteToHostException: in taskmanag



When I start my Apache Flink 1.10 taskmanager service in a Kubernetes (v1.15.2) cluster, it shows the following log:

2020-05-01 08:34:55,847 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@flink-jobmanager:6123/user/resourcemanager, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@flink-jobmanager:6123/user/resourcemanager..
2020-05-01 08:34:55,847 WARN  akka.remote.transport.netty.NettyTransport                    - Remote connection to [null] failed with java.net.NoRouteToHostException: No route to host
2020-05-01 08:34:55,848 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink@flink-jobmanager:6123] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@flink-jobmanager:6123]] Caused by: [java.net.NoRouteToHostException: No route to host]
2020-05-01 08:35:08,874 WARN  akka.remote.transport.netty.NettyTransport                    - Remote connection to [null] failed with java.net.NoRouteToHostException: No route to host
2020-05-01 08:35:08,877 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink@flink-jobmanager:6123] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@flink-jobmanager:6123]] Caused by: [java.net.NoRouteToHostException: No route to host]
2020-05-01 08:35:08,878 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@flink-jobmanager:6123/user/resourcemanager, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@flink-jobmanager:6123/user/resourcemanager..
2020-05-01 08:35:21,907 WARN  akka.remote.transport.netty.NettyTransport                    - Remote connection to [null] failed with java.net.NoRouteToHostException: No route to host

The taskmanager cannot register successfully. I logged into the taskmanager pod and found that I can ping the jobmanager successfully, as shown below:

flink@flink-taskmanager-54d85f57c7-nl9cf:~$ ping flink-jobmanager
PING flink-jobmanager.dabai-fat.svc.cluster.local (10.254.58.171) 56(84) bytes of data.
64 bytes from flink-jobmanager.dabai-fat.svc.cluster.local (10.254.58.171): icmp_seq=1 ttl=64 time=0.045 ms
64 bytes from flink-jobmanager.dabai-fat.svc.cluster.local (10.254.58.171): icmp_seq=2 ttl=64 time=0.076 ms
64 bytes from flink-jobmanager.dabai-fat.svc.cluster.local (10.254.58.171): icmp_seq=3 ttl=64 time=0.079 ms

So why does this happen, and what should I do to fix it?

Try installing nmap in the Kubernetes taskmanager pod's container:

apt-get update
apt-get install nmap -y

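Then scan the jobmanager to check that its exposed port 6123 is reachable from this pod (in my case, I found that port 6123 could not be reached from the current pod):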

nmap -T4 <your-jobmanager's-pod-ip>
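
If installing nmap in the container is not an option, a plain TCP connect gives a similar answer for a single port. The following is only a minimal sketch: it assumes a Python 3 interpreter is available in the pod (the Flink image may not ship one), and `can_connect` is a hypothetical helper name, not part of Flink or Kubernetes. The host name `flink-jobmanager` and port 6123 are taken from the log lines in the question. A successful ping only shows that ICMP gets through, so a TCP-level check like this is closer to what the taskmanager actually attempts.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, DNS failures,
        # and "No route to host".
        return False


if __name__ == "__main__":
    # Service name and RPC port as they appear in the taskmanager log.
    host, port = "flink-jobmanager", 6123
    print(f"{host}:{port} reachable: {can_connect(host, port)}")
```

If this reports the port as unreachable while ping works, the problem is likely at the TCP level (for example the Service definition, a network policy, or a firewall rule on the node) rather than in name resolution.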

Hope this helps.
