C - Strange return value from the close() system call on x86_64



My xinetd daemon suddenly stopped working after a kernel upgrade (from 2.6.24 to 2.6.33). I ran strace on it and found this:

[...]
close(3)                                = 0
munmap(0x7f1a93b43000, 4096)            = 0
getrlimit(RLIMIT_NOFILE, {rlim_cur=8*1024, rlim_max=16*1024}) = 0
setrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=1024}) = 0
close(3)                                = 4294967287
exit_group(1)                           = ?

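So basically, it looks like the close system call is returning something other than 0 or -1.

I ran several tests, and it seems to happen only with 64-bit executables: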

$ file closetest32
closetest32: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, not stripped
$ strace closetest32
execve("./closetest32", ["closetest32"], [/* 286 vars */]) = 0
[ Process PID=4731 runs in 32 bit mode. ]
open("/proc/mounts", O_RDONLY)          = 3
close(3)                                = 0
close(3)                                = -1 EBADF (Bad file descriptor)
_exit(0)                                = ?

$ file closetest64
closetest64: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), statically linked, not stripped
$ strace closetest64
execve("./closetest64", ["closetest64"], [/* 286 vars */]) = 0
open("/proc/mounts", O_RDONLY)          = 3
close(3)                                = 0
close(3)                                = 4294967287
_exit(0)                                = ?

I am running the following kernel:

Linux foobar01 2.6.33.9-rt31.64.el5rt #1 SMP PREEMPT RT Wed May 4 10:34:12 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

Worst of all, I cannot reproduce the bug on another machine running the same kernel.

Any ideas?

EDIT: As requested, here is the code for closetest32 and closetest64.

closetest32.asm:

.section .data
filename:
    .ascii "/proc/mounts"
.section .text
.globl _start
_start:
    xorl %edi, %edi
    movl $5, %eax # open() i386 system call
    leal filename, %ebx # %ebx ---> filename
    movl $0, %esi # O_RDONLY flag into esi
    int $0x80
    xorl %edi, %edi
    movl $6, %eax # close() i386 system call
    movl $3, %ebx # fd 3
    int $0x80
    xorl %edi, %edi
    movl $6, %eax # close() i386 system call
    movl $3, %ebx # fd 3
    int $0x80
    ## terminate program via _exit () system call
    movl $1, %eax # %eax =  _exit() i386 system call
    xorl %ebx, %ebx # %ebx = 0 normal program return code
    int $0x80

Assembled and linked with:

as test32.asm -o test32.o --32
ld -m elf_i386 test32.o -o closetest32

closetest64.asm:

.section .data
filename:
    .ascii "/proc/mounts"
.section .text
.globl _start
_start:
    xorq %rdi, %rdi
    movq $2, %rax # open() system call
    leaq filename, %rdi # %rdi ---> filename
    movq $0, %rsi # O_RDONLY flag into rsi
    syscall
    xorq %rdi, %rdi
    movq $3, %rax # close() system call
    movq $3, %rdi # fd 3
    syscall
    xorq %rdi, %rdi
    movq $3, %rax # close() system call
    movq $3, %rdi # fd 3
    syscall
    ## terminate program via _exit () system call
    movq $60, %rax # %rax = _exit() system call
    xorq %rdi, %rdi # %rdi = 0 normal program return code
    syscall

Assembled and linked with:

as test64.asm -o test64.o
ld test64.o -o closetest64

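As expected, rolling back to the previous kernel version fixed the problem. I am not really a kernel expert, but as far as I can tell, the answer given by @R makes sense: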

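This is a 64-bit machine, so (1<<32)-9 (that is, 4294967287) should never appear. The problem is that the kernel internally uses unsigned instead of int for the return value of some of these functions and then returns -EBADF, which gets reduced modulo 2^32 rather than modulo 2^64.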

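The issue is that the generic code in the libc syscall wrappers that handles syscall error returns has to treat the return value as a long (since it could be a pointer, or a long for some syscalls) when checking whether it is a small negative value that indicates an error. But the kernel returned (long)(unsigned)-9, which is very different from (long)-9 or (unsigned long)-9 (either of which would have worked).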
