I am trying to start multiple memcached processes from the Linux service framework using the following logic:
RETVAL=0
pcount="$CACHES"
if [ ! -z "$pcount" ]; then
    while [ $pcount -gt 0 ];
    do
        (( pcount-- ))
        (( port=PORT + pcount ))
        daemon --pidfile ${pidfile}${pcount}.pid memcached -d -p $port -u $USER -m $CACHESIZE -c $MAXCONN -P ${pidfile}${pcount}.pid $OPTIONS
        (( RETVAL=RETVAL + $? ))
    done
else
    daemon --pidfile ${pidfile}.pid memcached -d -p $PORT -u $USER -m $CACHESIZE -c $MAXCONN -P ${pidfile}.pid $OPTIONS
    RETVAL=$?
fi
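When I run it with the command service memcached start, it creates and updates a pid file for every iteration of the loop, but only the last instance of the process stays running. That is, each of /var/run/memcached/memcached(1 through 5).pid is created and updated with a PID, but those processes do not exist. /var/run/memcached/memcached0.pid is also created and updated, and its PID points to a running process.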
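To make the pairing of pid files and ports concrete, here is a minimal dry-run sketch of the loop's arithmetic. It starts nothing and only prints what each iteration would use. The values PORT=11211 and CACHES=6 are assumptions inferred from the pid files 0 through 5; the function name list_instances is hypothetical, and POSIX arithmetic is used in place of (( )) so that it also runs under plain sh.

```shell
#!/bin/sh
# Dry run of the start loop: print the port and pid file that each
# iteration would pass to memcached, without launching anything.
# list_instances is a hypothetical helper; the arguments are assumed values.
list_instances() {
    port_base=$1   # stands in for $PORT
    pcount=$2      # stands in for $CACHES
    pidfile=$3     # stands in for ${pidfile}
    while [ "$pcount" -gt 0 ]; do
        pcount=$((pcount - 1))
        port=$((port_base + pcount))
        echo "port=$port pidfile=${pidfile}${pcount}.pid"
    done
}

list_instances 11211 6 /var/run/memcached/memcached
```

With these assumed values the first line printed is port=11216 with memcached5.pid and the last is port=11211 with memcached0.pid, so the one instance that survives is the one on the base port.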
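With tracing turned on, I can see that the loop executes and that the process calls are made, but the processes do not start (or they start and terminate immediately, so I never see them running).
Running the script directly as /etc/init.d/memcached start, on the other hand, starts all the processes correctly.
Can anyone help me understand why the service framework prevents every instance except the last one from starting?
As @nos suggested, I added strace -f to trace the calls made during service memcached start. I compared the traced calls of an unsuccessful (terminated) process with those of a successful one. The only significant difference I found was: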
< bind(26, {sa_family=AF_INET, sin_port=htons(11216), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EACCES (Permission denied)
< dup(2) = 27
< fcntl(27, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
< fstat(27, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
< ioctl(27, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff20d5d780) = -1 ENOTTY (Inappropriate ioctl for device)
< mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5dae958000
< lseek(27, 0, SEEK_CUR) = 0
< write(27, "bind(): Permission denied\n", 26) = 26
< close(27) = 0
< munmap(0x7f5dae958000, 4096) = 0
< close(26) = 0
< dup(2) = 26
< fcntl(26, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
< fstat(26, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
< ioctl(26, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff20d5d730) = -1 ENOTTY (Inappropriate ioctl for device)
< mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5dae958000
< lseek(26, 0, SEEK_CUR) = 0
< write(26, "failed to listen on TCP port 112"..., 54) = 54
< close(26) = 0
< munmap(0x7f5dae958000, 4096) = 0
< exit_group(71) = ?
---
> bind(26, {sa_family=AF_INET, sin_port=htons(11211), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
> listen(26, 1024) = 0
> epoll_ctl(3, EPOLL_CTL_ADD, 26, {EPOLLIN, {u32=26, u64=26}}) = 0
> socket(PF_INET6, SOCK_STREAM, IPPROTO_TCP) = 27
> fcntl(27, F_GETFL) = 0x2 (flags O_RDWR)
> fcntl(27, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> setsockopt(27, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0
> setsockopt(27, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> setsockopt(27, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
> setsockopt(27, SOL_SOCKET, SO_LINGER, {onoff=0, linger=0}, 8) = 0
> setsockopt(27, SOL_TCP, TCP_NODELAY, [1], 4) = 0
> bind(27, {sa_family=AF_INET6, sin6_port=htons(11211), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
> listen(27, 1024) = 0
> epoll_ctl(3, EPOLL_CTL_ADD, 27, {EPOLLIN, {u32=27, u64=27}}) = 0
> socket(PF_NETLINK, SOCK_RAW, 0) = 28
> bind(28, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
> getsockname(28, {sa_family=AF_NETLINK, pid=31943, groups=00000000}, [12]) = 0
> gettimeofday({1393735036, 191154}, NULL) = 0
> sendto(28, "24 26 13|26522S ", 20, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20
> recvmsg(28, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"0 24 2 |26522S307| 2102003761 10 1 177 1"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 108
> recvmsg(28, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"@ 24 2 |26522S307| n2002003761 24 1 "..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 128
> recvmsg(28, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"24 3 2 |26522S307| 1 24 1 "..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 20
> close(28) = 0
> socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 28
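The top part (<) is from a terminated process and the bottom part (>) is from the last (successful) process. Clearly the process terminates because it lacks permission to bind to its port. Looking further, I realized that SELinux was set to enforcing, which prevents the memcached service from binding to any port other than 11211 (the default port).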
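A trace of this size can be narrowed down to the failing calls with a small filter. The sketch below assumes the trace was saved to a file, for example with strace -f -o (the path shown is a placeholder); failed_binds is a hypothetical helper name.

```shell
#!/bin/sh
# Print only the bind() calls that returned an error from a saved strace log.
# The log is assumed to come from something like:
#   strace -f -o /tmp/memcached.trace service memcached start
# (/tmp/memcached.trace is a placeholder path).
failed_binds() {
    grep 'bind(' "$1" | grep -- '= -1 E'
}

# Self-contained demonstration using two bind() lines from the trace above.
trace=$(mktemp)
cat > "$trace" <<'EOF'
bind(26, {sa_family=AF_INET, sin_port=htons(11216), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EACCES (Permission denied)
bind(26, {sa_family=AF_INET, sin_port=htons(11211), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
EOF
failed_binds "$trace"
rm -f "$trace"
```

Only the call for port 11216 is printed, which points straight at the EACCES result without reading the whole diff.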
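As far as I understand it, when I run the script without the service command, it behaves as an ordinary process (not a service), so the bind restriction is not enforced.
Turning off SELinux enforcing mode makes the service memcached start command work correctly!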
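Turning enforcement off is what worked here. A narrower option that is often used instead is to label the extra ports so that the policy permits them. The sketch below only prints the semanage commands and does not run them. It assumes the policy's port type is memcache_port_t (worth confirming with semanage port -l), a base port of 11211 and six instances; print_port_labels is a hypothetical helper name.

```shell
#!/bin/sh
# Print (do not run) one semanage command per extra memcached port.
# Assumptions: the port type is memcache_port_t, and the base port is
# already allowed by the policy, so only base+1 .. base+count-1 are listed.
print_port_labels() {
    port_base=$1
    count=$2
    i=1
    while [ "$i" -lt "$count" ]; do
        echo "semanage port -a -t memcache_port_t -p tcp $((port_base + i))"
        i=$((i + 1))
    done
}

print_port_labels 11211 6
```

The printed commands would need to be reviewed and run as root. If the instances also listen on UDP, the same ports may need a matching -p udp entry; that is an assumption to check against the actual memcached options in use.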