Why does loading a seccomp filter affect the permitted and effective capability sets?



I have recently been writing a program with libcap and libseccomp, and I noticed a problem when the two are used together.

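In the minimal reproducible example below, I first set the current process's capabilities to P(inheritable) = CAP_NET_RAW only, clearing the other capability sets. I then initialize a seccomp filter with the SCMP_ACT_ALLOW action (all system calls are allowed by default), load it, and release it.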

Finally, the program prints its current capabilities and then execve()s capsh --print to show the capabilities it has after the exec.

#include <linux/capability.h>
#include <sys/capability.h>
#include <unistd.h>
#include <sys/types.h>
#include <stdio.h>
#include <seccomp.h>

#define CAPSH "/usr/sbin/capsh"

int main(void) {
    cap_value_t net_raw = CAP_NET_RAW;

    /* Start from an empty set and make only CAP_NET_RAW inheritable. */
    cap_t caps = cap_init();
    cap_set_flag(caps, CAP_INHERITABLE, 1, &net_raw, CAP_SET);
    if (cap_set_proc(caps)) {
        perror("cap_set_proc");
    }
    cap_free(caps);

    /* Allow-all seccomp filter: load it, then release the context. */
    scmp_filter_ctx ctx;
    if ((ctx = seccomp_init(SCMP_ACT_ALLOW)) == NULL) {
        perror("seccomp_init");
    }
    int rc = 0;
    rc = seccomp_load(ctx); // comment this line later
    if (rc < 0)
        perror("seccomp_load");
    seccomp_release(ctx);

    /* Print the capabilities of this process before the exec. */
    ssize_t y = 0;
    printf("Process capabilities: %s\n", cap_to_text(cap_get_proc(), &y));

    char *argv[] = {
        CAPSH,
        "--print",
        NULL
    };
    execve(CAPSH, argv, NULL);
    return -1;
}

I compile it with -lcap -lseccomp, run it as root (UID=EUID=0), and get the following:

Process capabilities: = cap_net_raw+i
Current: = cap_net_raw+i
Bounding set =cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
uid=0(root)
gid=0(root)
groups=0(root)

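This shows that, for both the current process and the executed capsh, only the inheritable set is non-empty. However, if I comment out the line rc = seccomp_load(ctx);, things are different: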

Process capabilities: = cap_net_raw+i
Current: = cap_net_raw+eip cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read+ep
Bounding set =cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read
Securebits: 00/0x0/1'b0
secure-noroot: no (unlocked)
secure-no-suid-fixup: no (unlocked)
secure-keep-caps: no (unlocked)
uid=0(root)
gid=0(root)
groups=0(root)

Before execve(), the result is the same as above. After it, however, all the other capabilities are back in the permitted and effective sets.

I looked up capabilities(7) and found the following in the manual:

Capabilities and execution of programs by root
In order to mirror traditional UNIX semantics, the kernel performs
special treatment of file capabilities when a process with UID 0
(root) executes a program and when a set-user-ID-root program is
executed.
After having performed any changes to the process effective ID that
were triggered by the set-user-ID mode bit of the binary—e.g.,
switching the effective user ID to 0 (root) because a set-user-ID-
root program was executed—the kernel calculates the file capability
sets as follows:
1. If the real or effective user ID of the process is 0 (root), then
the file inheritable and permitted sets are ignored; instead they
are notionally considered to be all ones (i.e., all capabilities
enabled).  (There is one exception to this behavior, described
below in Set-user-ID-root programs that have file capabilities.)
2. If the effective user ID of the process is 0 (root) or the file
effective bit is in fact enabled, then the file effective bit is
notionally defined to be one (enabled).
These notional values for the file's capability sets are then used as
described above to calculate the transformation of the process's
capabilities during execve(2).
Thus, when a process with nonzero UIDs execve(2)s a set-user-ID-root
program that does not have capabilities attached, or when a process
whose real and effective UIDs are zero execve(2)s a program, the
calculation of the process's new permitted capabilities simplifies to:
P'(permitted)   = P(inheritable) | P(bounding)
P'(effective)   = P'(permitted)
Consequently, the process gains all capabilities in its permitted and
effective capability sets, except those masked out by the capability
bounding set.  (In the calculation of P'(permitted), the P'(ambient)
term can be simplified away because it is by definition a proper
subset of P(inheritable).)
The special treatments of user ID 0 (root) described in this
subsection can be disabled using the securebits mechanism described below.

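This is where I am confused: the inheritable set is not empty, so according to the simplified rule, neither the permitted nor the effective set should be empty. Yet "loading a seccomp filter" seems to violate this rule.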

seccomp itself does not do this, but libseccomp does.

Using strace, you can see that seccomp_load actually performs three system calls:

prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)  = 0
seccomp(SECCOMP_SET_MODE_STRICT, 1, NULL) = -1 EINVAL (Invalid argument)
seccomp(SECCOMP_SET_MODE_FILTER, 0, {len=7, filter=0x5572a6213930}) = 0

Note that the first one looks suspicious.

From the kernel documentation on no_new_privs:

With no_new_privs set, execve() promises not to grant the privilege to do anything that could not have been done without the execve call.

And from the capabilities(7) text you quoted:

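If the real or effective user ID of the process is 0 (root), then the file inheritable and permitted sets are ignored; instead they are notionally considered to be all ones (i.e., all capabilities enabled).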

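Your code creates an empty capability set (cap_t caps = cap_init()) and adds CAP_NET_RAW only as inheritable, so no capability is permitted (hence = cap_net_raw+i). Then, because NO_NEW_PRIVS is set for this thread, the permitted set is not restored to the full set when execve is called, as it normally would be for a root process (UID=0 or EUID=0). This explains the difference in the capsh --print output with and without seccomp_load().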

Once set, the NO_NEW_PRIVS flag cannot be reset (see prctl(2)), and seccomp_load() sets it by default for good reason.

To prevent seccomp_load() from setting NO_NEW_PRIVS, add the following code before loading the context:

seccomp_attr_set(ctx, SCMP_FLTATR_CTL_NNP, 0);

See seccomp_attr_set(3) for more details.

However, you should probably do this the right way instead and add the capabilities you need to the permitted set as well:

cap_set_flag(caps, CAP_PERMITTED, 1, &net_raw, CAP_SET);
