In my Docker container, why can I still bind port 1 without the `NET_BIND_SERVICE` capability?



I am using Ubuntu 18.04 Desktop. Below are more details about my problem.

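Recently I wrote some test code that is meant to do the following: when run as a non-privileged user, the test code tries to bind a privileged port (port 1 in my case) and expects the bind to fail.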

On my host machine, my current non-privileged user has the following capsh --print output:

Current: =
Bounding set =cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
uid=1000(ywen)
gid=1000(ywen)
groups=4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),116(lpadmin),126(sambashare),999(docker),1000(ywen)

So when I try to bind port 1 as the current non-privileged user, I get the expected permission-denied error:

Python 3.6.9 (default, Oct  8 2020, 12:12:24) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket as s
>>> o = s.socket(s.AF_INET)
>>> o.bind(("127.0.0.1", 1))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
PermissionError: [Errno 13] Permission denied
>>> exit()
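
The interactive check above can also be written as a small script, which makes it easier to run unchanged on the host and inside a container. This is only a minimal sketch: the helper name try_bind is hypothetical, and it assumes a TCP socket on 127.0.0.1 as in the session above.

```python
import errno
import socket


def try_bind(port, host="127.0.0.1"):
    """Try to bind a TCP socket to (host, port).

    Returns None on success, or the errno value if bind() failed.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            return exc.errno
    return None


if __name__ == "__main__":
    result = try_bind(1)
    if result is None:
        print("bind to port 1 succeeded")
    elif result == errno.EACCES:
        # EACCES is errno 13, the "Permission denied" seen above.
        print("bind to port 1 was denied (EACCES)")
    else:
        print("bind to port 1 failed:", errno.errorcode.get(result, result))
```

A test that expects the bind to be refused would then assert that try_bind(1) returns errno.EACCES.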

Because my test code will eventually run in a Docker container, I built an image using the following Dockerfile:

ARG UBUNTU_VERSION=18.04
FROM ubuntu:${UBUNTU_VERSION}
ARG USER_NAME=ywen
ARG USER_ID=1000
ARG GROUP_ID=1000
RUN apt-get update
# Install the needed packages.
RUN DEBIAN_FRONTEND=noninteractive apt-get -y install \
    bash-completion \
    libcap2-bin \
    openssh-server \
    openssh-client \
    sudo \
    tree \
    vim
# Add a non-privileged user.
RUN groupadd -g ${GROUP_ID} ${USER_NAME} && \
    useradd -r --create-home -u ${USER_ID} -g ${USER_NAME} ${USER_NAME}
# Give the non-privileged user the privilege to run `sudo` without a password.
RUN echo "${USER_NAME} ALL=(ALL:ALL) NOPASSWD: ALL" > /etc/sudoers.d/${USER_NAME}
# Switch to the non-root user.
USER ${USER_NAME}
# The default command when the container is run.
CMD ["/bin/sleep", "infinity"]

I ran the following docker build command:

docker build -f ./Dockerfile.ubuntu --tag port-binding .

The resulting image is called port-binding:latest.

I then ran it, first with Docker's default capabilities:

docker run --rm -it --name binding port-binding /bin/bash

I then logged into the container and ran capsh --print. I got:

Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+i
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
uid=1000(ywen)
gid=1000(ywen)
groups=

At this point I have the cap_net_bind_service capability. So when I ran the test code from the beginning of this post, the port binding succeeded and I got no error:

Python 3.6.9 (default, Oct  8 2020, 12:12:24) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket as s
>>> o = s.socket(s.AF_INET)
>>> o.bind(("127.0.0.1", 1))    # Succeeded here.
>>>

I think the success was expected because the container has the cap_net_bind_service capability. So I stopped the container and started a new one with cap_net_bind_service dropped:

docker run --rm -it --cap-drop=NET_BIND_SERVICE --name binding port-binding /bin/bash

In the new container, capsh --print no longer shows cap_net_bind_service:

Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+i
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
uid=1000(ywen)
gid=1000(ywen)
groups=
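
One detail in the capsh output may be worth double-checking: in capsh's text format, the trailing +i marks the listed capabilities as inheritable only, which is not the same as having them in the effective set. The sketch below reads the capability bitmasks straight from /proc/self/status and tests the CAP_NET_BIND_SERVICE bit (bit 10 in <linux/capability.h>). The helper names capability_sets and has_capability are hypothetical, chosen only for this illustration.

```python
CAP_NET_BIND_SERVICE = 10  # bit number from <linux/capability.h>

CAP_FIELDS = ("CapInh", "CapPrm", "CapEff", "CapBnd", "CapAmb")


def capability_sets(status_text):
    """Parse the Cap* lines of /proc/<pid>/status into {name: bitmask}."""
    sets = {}
    for line in status_text.splitlines():
        name, _, value = line.partition(":")
        if name in CAP_FIELDS:
            # The kernel prints each set as a hexadecimal bitmask.
            sets[name] = int(value.strip(), 16)
    return sets


def has_capability(mask, cap=CAP_NET_BIND_SERVICE):
    """Return True if bit `cap` is set in the bitmask."""
    return bool(mask & (1 << cap))


if __name__ == "__main__":
    try:
        with open("/proc/self/status") as f:
            sets = capability_sets(f.read())
    except OSError:
        print("/proc/self/status is not available on this system")
    else:
        for name, mask in sets.items():
            print(name, "has cap_net_bind_service:", has_capability(mask))
```

If CapEff reports False in both containers, the capability was never what allowed the bind, and something else must be permitting it.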

But when I ran the test code, I found that I could still bind port 1 successfully:

Python 3.6.9 (default, Oct  8 2020, 12:12:24) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket as s
>>> o = s.socket(s.AF_INET)
>>> o.bind(("127.0.0.1", 1))    # Didn't raise an error. Still succeeded here.
>>>

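However, from reading the following posts, I believe that dropping NET_BIND_SERVICE should be the right thing to do. Clearly I made a mistake somewhere. Can anyone tell me what I did wrong?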

  • capabilities(7)
  • https://superuser.com/a/892391/224429
  • https://serverfault.com/a/112798/125167

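I ran into the opposite problem: I wanted to bind to port 80 but couldn't. Two days of debugging led to this: https://github.com/moby/moby/pull/41030. Since docker 20.03.0, the sysctl net.ipv4.ip_unprivileged_port_start defaults to 0 for containers, which has the same effect as cap_net_bind_service: every process inside the container can now bind to any port (of the container), even as a non-privileged user. It can be set externally with docker run --sysctl net.ipv4.ip_unprivileged_port_start=0 ... or in docker-compose.yml with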

sysctls:
- net.ipv4.ip_unprivileged_port_start=0

Set it to 1024 to get the same behavior as before docker 20.03.0.
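
To see which value is in effect inside a given container, the sysctl can be read from /proc. The following is a minimal sketch, assuming a Linux kernel that exposes net.ipv4.ip_unprivileged_port_start; the function names are hypothetical. Ports below the sysctl value require CAP_NET_BIND_SERVICE, and port 0 is excluded because it only asks the kernel to pick an ephemeral port.

```python
SYSCTL_PATH = "/proc/sys/net/ipv4/ip_unprivileged_port_start"


def unprivileged_port_start(path=SYSCTL_PATH):
    """Read the first port number that unprivileged processes may bind."""
    with open(path) as f:
        return int(f.read().strip())


def needs_net_bind_service(port, start):
    """Return True if binding `port` requires CAP_NET_BIND_SERVICE.

    `start` is the value of net.ipv4.ip_unprivileged_port_start.
    """
    return 0 < port < start


if __name__ == "__main__":
    try:
        start = unprivileged_port_start()
    except OSError:
        print("sysctl not available on this system")
    else:
        print("ip_unprivileged_port_start =", start)
        print("port 1 needs cap_net_bind_service:",
              needs_net_bind_service(1, start))
```

With the value 0 nothing counts as a privileged port, which matches the behavior observed in the question; with 1024, binding port 1 without the capability should be refused again.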