Docker Swarm overlay network: ICMP works, but nothing else does



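I have a small 1-manager, 3-worker cluster set up to experiment with a few things. Swarm orchestration is running, and I can spin up services across the cluster from any stack and serve web applications through the ingress network. I haven't changed anything from the default yum install of docker-ce: it is a vanilla installation with no configuration changes on any node.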

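However, service-to-service communication fails on other overlay networks. I created a Docker overlay network, testnet, with the --attachable flag, and attached an nginx container (named nginx1) on node-1 and a netshoot container (named netshoot1) on manager-1.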

I can ping nginx1 from netshoot1 and vice versa, and I can watch these packets being exchanged with tcpdump on both nodes.

# tcpdump -vvnn -i any src 10.1.72.70 and dst 10.1.72.71 and port 4789
00:20:39.302561 IP (tos 0x0, ttl 64, id 49791, offset 0, flags [none], proto UDP (17), length 134)
10.1.72.70.53237 > 10.1.72.71.4789: [udp sum ok] VXLAN, flags [I] (0x08), vni 4101
IP (tos 0x0, ttl 64, id 20598, offset 0, flags [DF], proto ICMP (1), length 84)
10.0.5.18 > 10.0.5.24: ICMP echo request, id 21429, seq 1, length 64

This shows netshoot1 (10.0.5.18) pinging nginx1 (10.0.5.24); the echo succeeds.

But if I then run # curl -v nginx1:80, the request times out.

With tcpdump I can see the packets leave the manager-1 node, but they never arrive at node-1.

00:22:22.809057 IP (tos 0x0, ttl 64, id 42866, offset 0, flags [none], proto UDP (17), length 110)
10.1.72.70.53764 > 10.1.72.71.4789: [bad udp cksum 0x5b97 -> 0x697d!] VXLAN, flags [I] (0x08), vni 4101
IP (tos 0x0, ttl 64, id 43409, offset 0, flags [DF], proto TCP (6), length 60)
10.0.5.18.53668 > 10.0.5.24.80: Flags [S], cksum 0x1e58 (incorrect -> 0x2c3e), seq 1616566654, win 28200, options [mss 1410,sack OK,TS val 913132903 ecr 0,nop,wscale 7], length 0

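These are virtual machines running in an on-premises VMware data center. The network team says the network firewall should not be blocking or inspecting this traffic, because the IPs are on the same subnet.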

Is this a Docker configuration problem? iptables?

OS: RHEL 8

Docker CE: 20.10.2

containerd: 3

IPTABLES on manager-1

Chain INPUT (policy DROP 0 packets, 0 bytes)
num   pkts bytes target     prot opt in     out     source               destination
1    9819K 2542M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED
2        8   317 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0            icmptype 255
3      473 33064 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0
4        0     0 DROP       all  --  *      *       127.0.0.0/8          0.0.0.0/0
5      116  6192 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:22
6     351K   21M ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            source IP range 10.1.72.71-10.1.72.73 state NEW multiport dports 2377,7946 
7      435 58400 ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0            source IP range 10.1.72.71-10.1.72.73 state NEW multiport dports 7946,4789
8    17142 1747K REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-prohibited
Chain FORWARD (policy DROP 8 packets, 384 bytes)
num   pkts bytes target     prot opt in     out     source               destination
1    14081   36M DOCKER-USER  all  --  *      *       0.0.0.0/0            0.0.0.0/0
2    14081   36M DOCKER-INGRESS  all  --  *      *       0.0.0.0/0            0.0.0.0/0
3     267K  995M DOCKER-ISOLATION-STAGE-1  all  --  *      *       0.0.0.0/0            0.0.0.0/0
4    39782  121M ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
5     1598 95684 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0
6    41470  717M ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0
7        0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0
8    90279   23M ACCEPT     all  --  *      docker_gwbridge  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
9        5   300 DOCKER     all  --  *      docker_gwbridge  0.0.0.0/0            0.0.0.0/0
10   94041  134M ACCEPT     all  --  docker_gwbridge !docker_gwbridge  0.0.0.0/0            0.0.0.0/0
11       0     0 DROP       all  --  docker_gwbridge docker_gwbridge  0.0.0.0/0            0.0.0.0/0
Chain OUTPUT (policy ACCEPT 11M packets, 2365M bytes)
num   pkts bytes target     prot opt in     out     source               destination
Chain DOCKER (2 references)
num   pkts bytes target     prot opt in     out     source               destination
1     1598 95684 ACCEPT     tcp  --  !docker0 docker0  0.0.0.0/0            172.17.0.2           tcp dpt:5000
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
num   pkts bytes target     prot opt in     out     source               destination
1    41470  717M DOCKER-ISOLATION-STAGE-2  all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0
2    93853  133M DOCKER-ISOLATION-STAGE-2  all  --  docker_gwbridge !docker_gwbridge  0.0.0.0/0            0.0.0.0/0
3     267K  995M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0
Chain DOCKER-USER (1 references)
num   pkts bytes target     prot opt in     out     source               destination
1    1033K 1699M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0
Chain DOCKER-INGRESS (1 references)
num   pkts bytes target     prot opt in     out     source               destination
1        0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8502
2        0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED tcp spt:8502
3     267K  995M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0
Chain DOCKER-ISOLATION-STAGE-2 (2 references)
num   pkts bytes target     prot opt in     out     source               destination
1        0     0 DROP       all  --  *      docker0  0.0.0.0/0            0.0.0.0/0
2        0     0 DROP       all  --  *      docker_gwbridge  0.0.0.0/0            0.0.0.0/0
3     135K  851M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0

IPTABLES on node-1

Chain INPUT (policy DROP 0 packets, 0 bytes)
num   pkts bytes target     prot opt in     out     source               destination
1    6211K 3343M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED
2        7   233 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0            icmptype 255
3      471 32891 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0
4        0     0 DROP       all  --  *      *       127.0.0.0/8          0.0.0.0/0
5       84  4504 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:22 /* ssh from anywhere */
6    26940 1616K ACCEPT     tcp  --  *      *       10.1.72.70           0.0.0.0/0            state NEW multiport dports 7946 /* docker swarm cluster comm- manager,node2,3 */
7    31624 1897K ACCEPT     tcp  --  *      *       10.1.72.72           0.0.0.0/0            state NEW multiport dports 7946 /* docker swarm cluster comm- manager,node2,3 */
8    30583 1835K ACCEPT     tcp  --  *      *       10.1.72.73           0.0.0.0/0            state NEW multiport dports 7946 /* docker swarm cluster comm- manager,node2,3 */
9      432 58828 ACCEPT     udp  --  *      *       10.1.72.70           0.0.0.0/0            state NEW multiport dports 7946,4789 /* docker swarm cluster comm and overlay netw- manager,node2,3 */
10      10  1523 ACCEPT     udp  --  *      *       10.1.72.72           0.0.0.0/0            state NEW multiport dports 7946,4789 /* docker swarm cluster comm and overlay netw- manager,node2,3 */
11       7  1159 ACCEPT     udp  --  *      *       10.1.72.73           0.0.0.0/0            state NEW multiport dports 7946,4789 /* docker swarm cluster comm and overlay netw- manager,node2,3 */
12   17172 1749K REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-prohibited
Chain FORWARD (policy DROP 19921 packets, 1648K bytes)
num   pkts bytes target     prot opt in     out     source               destination
1    23299   22M DOCKER-USER  all  --  *      *       0.0.0.0/0            0.0.0.0/0
2    23299   22M DOCKER-INGRESS  all  --  *      *       0.0.0.0/0            0.0.0.0/0
3     787K 1473M DOCKER-ISOLATION-STAGE-1  all  --  *      *       0.0.0.0/0            0.0.0.0/0
4        0     0 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
5        0     0 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0
6        0     0 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0
7        0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0
8     386K  220M ACCEPT     all  --  *      docker_gwbridge  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
9        0     0 DOCKER     all  --  *      docker_gwbridge  0.0.0.0/0            0.0.0.0/0
10    402K 1254M ACCEPT     all  --  docker_gwbridge !docker_gwbridge  0.0.0.0/0            0.0.0.0/0
11       0     0 DROP       all  --  docker_gwbridge docker_gwbridge  0.0.0.0/0            0.0.0.0/0
Chain OUTPUT (policy ACCEPT 8193K packets, 2659M bytes)
num   pkts bytes target     prot opt in     out     source               destination
Chain DOCKER-INGRESS (1 references)
num   pkts bytes target     prot opt in     out     source               destination
1        0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:8502
2        0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED tcp spt:8502
3     787K 1473M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0
Chain DOCKER-USER (1 references)
num   pkts bytes target     prot opt in     out     source               destination
1     792K 1474M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0
Chain DOCKER (2 references)
num   pkts bytes target     prot opt in     out     source               destination
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
num   pkts bytes target     prot opt in     out     source               destination
1        0     0 DOCKER-ISOLATION-STAGE-2  all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0
2     402K 1254M DOCKER-ISOLATION-STAGE-2  all  --  docker_gwbridge !docker_gwbridge  0.0.0.0/0            0.0.0.0/0
3     787K 1473M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0
Chain DOCKER-ISOLATION-STAGE-2 (2 references)
num   pkts bytes target     prot opt in     out     source               destination
1        0     0 DROP       all  --  *      docker0  0.0.0.0/0            0.0.0.0/0
2        0     0 DROP       all  --  *      docker_gwbridge  0.0.0.0/0            0.0.0.0/0
3     402K 1254M RETURN     all  --  *      *       0.0.0.0/0            0.0.0.0/0

The problem was indeed the bad checksums on the outbound packets. The VMware network interface was dropping the packets because of them.

The solution was to disable checksum offloading, using ethtool:

# ethtool -K <interface> tx off
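As a rough sketch of how one might check the current state before changing it: `ethtool -k <interface>` (lowercase k) lists the offload features, and the `tx-checksumming` line shows whether TX checksum offload is on. The function names below are made up for illustration, and the interface name is whatever your host uses. Note that a change made with `ethtool -K` is a runtime setting and is normally lost on reboot unless your distribution's network configuration reapplies it.

```shell
#!/bin/sh
# Print the tx-checksumming state ("on" or "off") parsed from
# `ethtool -k` output supplied on stdin.
tx_checksum_state() {
    awk -F': *' '$1 == "tx-checksumming" { split($2, a, " "); print a[1]; exit }'
}

# Hypothetical helper: turn TX checksum offload off on the given interface
# only if it is currently on, then show the resulting checksum settings.
# Must be run as root.
disable_tx_offload() {
    iface=$1
    state=$(ethtool -k "$iface" | tx_checksum_state)
    if [ "$state" = "on" ]; then
        ethtool -K "$iface" tx off
    fi
    ethtool -k "$iface" | grep -i checksumming
}

# Usage (replace <interface> with the real name):
#   disable_tx_offload <interface>
```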

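I had exactly the same problem (the only thing that worked on my overlay network was ping; everything else just vanished). This thread saved my life. I had been pulling my hair out all day, so I thought I'd add my five cents.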

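This was also on a VMware server, running Ubuntu 22.04. My solution was to change the network interface type from vmxnet3 to a plain E1000E card, and suddenly everything started working. Clearly something odd is going on in vmxnet3. What puzzles me is that this doesn't seem to be a big issue for more users; running Docker Swarm on a VMware server should be fairly normal, right?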
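To confirm which virtual NIC type a VM is actually using before or after such a change, `ethtool -i <interface>` reports the kernel driver. A minimal sketch follows; the function name is hypothetical and the interface name is a placeholder.

```shell
#!/bin/sh
# Print the kernel driver name parsed from `ethtool -i` output on stdin.
nic_driver() {
    awk -F': *' '$1 == "driver" { print $2; exit }'
}

# Usage (replace <interface> with the real name):
#   ethtool -i <interface> | nic_driver
```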

I solved the same problem without using "ethtool", just by using endpoint_mode and publishing the ports in host mode. Here are the changes I made in my compose file:

1-
    ports:
      - target: 2379
        published: 2379
        protocol: tcp
        mode: host
2-
    deploy:
      endpoint_mode: dnsrr
3- adding
    hostname: <service_name>

If anyone has tried ethtool -K <iface name> tx off and it still doesn't work, try setting the overlay network's MTU to a value smaller than the standard (1500).

For example:

docker network create -d overlay --attachable --opt com.docker.network.driver.mtu=1450 my-network
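For what it's worth, the value 1450 lines up with the overhead VXLAN adds on an IPv4 underlay: 50 bytes per packet. The arithmetic below is only a sketch with made-up function names. It also reproduces the mss 1410 seen in the TCP SYN capture in the question, assuming plain 20-byte IPv4 and TCP headers. If the underlay MTU is below 1500, the overlay value would need to shrink by the same amount.

```shell
#!/bin/sh
# Overhead added by VXLAN over IPv4, in bytes:
# inner Ethernet header (14) + VXLAN header (8) + outer UDP (8) + outer IPv4 (20)
VXLAN_OVERHEAD=$((14 + 8 + 8 + 20))

# Largest overlay MTU that still fits in one underlay packet of the given MTU.
overlay_mtu() {
    echo $(( $1 - VXLAN_OVERHEAD ))
}

# TCP MSS for a given MTU, assuming IPv4 and TCP headers without options (20 + 20).
tcp_mss() {
    echo $(( $1 - 40 ))
}

overlay_mtu 1500   # prints 1450
tcp_mss 1450       # prints 1410
```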
