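I have a MongoDB StatefulSet running on OpenShift v3.11. The PersistentVolume uses NFSv4.
In our environment, I set up the directory on the NFS server to be owned by nfsnobody:nfsnobody. SELinux is also set to Permissive. All directories and files inside it are granted chmod ug+rwx,o-rwx permissions as well.
This is so that at runtime, when the Pod accesses the share as a user in group root (gid=0), it can read and write the share, because NFS by default squashes the root user and group to nfsnobody.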
$> ls -halZ /srv/share/openshift/mongo/
drwxrwx---. nfsnobody nfsnobody unconfined_u:object_r:default_t:s0 data
This setup worked for several months, but then it started failing.
Now when I deploy the Pod, it fails to start with the following error:
2021-01-26T16:12:48.163+0000 W STORAGE [initandlisten] Detected unclean shutdown - /var/lib/mongodb/data/mongod.lock is not empty.
2021-01-26T16:12:48.163+0000 I STORAGE [initandlisten] Detected data files in /var/lib/mongodb/data created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2021-01-26T16:12:48.163+0000 W STORAGE [initandlisten] Recovering data from the last clean checkpoint.
2021-01-26T16:12:48.164+0000 I STORAGE [initandlisten] wiredtiger_open config:create,cache_size=31220M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
2021-01-26T16:12:48.688+0000 E STORAGE [initandlisten] WiredTiger error (1) [1611677568:688148][457:0x7f9b59cc1ca8], file:WiredTiger.wt, connection: __posix_open_file, 715: /var/lib/mongodb/data/WiredTiger.wt: handle-open: open: Operation not permitted Raw: [1611677568:688148][457:0x7f9b59cc1ca8], file:WiredTiger.wt, connection: __posix_open_file, 715: /var/lib/mongodb/data/WiredTiger.wt: handle-open: open: Operation not permitted
2021-01-26T16:12:48.708+0000 E STORAGE [initandlisten] WiredTiger error (1) [1611677568:708810][457:0x7f9b59cc1ca8], file:WiredTiger.wt, connection: __posix_open_file, 715: /var/lib/mongodb/data/WiredTiger.wt: handle-open: open: Operation not permitted Raw: [1611677568:708810][457:0x7f9b59cc1ca8], file:WiredTiger.wt, connection: __posix_open_file, 715: /var/lib/mongodb/data/WiredTiger.wt: handle-open: open: Operation not permitted
2021-01-26T16:12:48.728+0000 E STORAGE [initandlisten] WiredTiger error (1) [1611677568:728860][457:0x7f9b59cc1ca8], file:WiredTiger.wt, connection: __posix_open_file, 715: /var/lib/mongodb/data/WiredTiger.wt: handle-open: open: Operation not permitted Raw: [1611677568:728860][457:0x7f9b59cc1ca8], file:WiredTiger.wt, connection: __posix_open_file, 715: /var/lib/mongodb/data/WiredTiger.wt: handle-open: open: Operation not permitted
2021-01-26T16:12:48.744+0000 W STORAGE [initandlisten] Failed to start up WiredTiger under any compatibility version.
2021-01-26T16:12:48.744+0000 F STORAGE [initandlisten] Reason: 1: Operation not permitted
2021-01-26T16:12:48.744+0000 F - [initandlisten] Fatal Assertion 28595 at src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp 638
2021-01-26T16:12:48.744+0000 F - [initandlisten]
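At first glance, one might say "the mongod process probably does not have permission to read the files". However, when I run the Pod in debug mode to get a terminal, I have full access to the path /var/lib/mongodb/data.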
$> id
uid=1000230000 gid=0(root) groups=0(root),1000230000
$> cd /var/lib/mongodb/data
/var/lib/mongodb/data$> echo "This is a test" >new_file
/var/lib/mongodb/data$> rm new_file
/var/lib/mongodb/data$> cat WiredTiger.wt | wc -l
23
/var/lib/mongodb/data$> mongod --dbpath $(pwd)
....failed...
The commands above show that I can read /var/lib/mongodb/data/WiredTiger.wt to count its lines, but the mongod process cannot.
Only when I run
# 1000230000 is the random UID and GID granted by OpenShift for the Pod.
$> chown -R 1000230000:nfsnobody /srv/share/openshift/mongo/
…is the Pod able to read the files.
Is there anything else I should check to solve this?
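Update:
- The MongoDB version is 4.0.5.
- Added more log output, which pinpoints where the error occurs: wiredtiger_kv_engine.cpp.
By reading the MongoDB source code at tag r4.0.5, I can now understand why I get the error.
Thanks to @Alex Blex for pointing me to the source code!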
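When mongod tries to read WiredTiger.wt (or any other file), it tries not to update the file's last access time (st_atime in the inode). The reason is to improve performance. Under the hood, it uses the open() system call with the O_NOATIME flag.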
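According to the open() man page:
This flag can be employed only if one of the following conditions is true:
- The effective UID of the process matches the owner UID of the file.
- The calling process has the CAP_FOWNER capability in its user namespace and the owner UID of the file has a mapping in the namespace.
Otherwise the call fails with the error: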
EPERM The O_NOATIME flag was specified, but the effective user
ID of the caller did not match the owner of the file and
the caller was not privileged.
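The rule quoted above can be written down as a small predicate. This is only a sketch of the man page wording, not kernel code: the function name can_use_noatime is made up for illustration, the user-namespace mapping condition is folded into a single has_cap_fowner flag, and 65534 is assumed to be the UID of nfsnobody (typical on RHEL/CentOS, but not confirmed in this post).

```python
def can_use_noatime(effective_uid: int, file_owner_uid: int,
                    has_cap_fowner: bool = False) -> bool:
    """Approximate the open(2) rule for O_NOATIME.

    The flag is allowed if the caller owns the file, or if the caller
    holds CAP_FOWNER (the namespace-mapping detail is simplified away).
    """
    return effective_uid == file_owner_uid or has_cap_fowner


POD_UID = 1000230000    # UID shown by `id` inside the Pod
NFSNOBODY_UID = 65534   # assumption: usual nfsnobody UID on RHEL/CentOS

# File owned by nfsnobody, Pod running as an unprivileged random UID.
print(can_use_noatime(POD_UID, NFSNOBODY_UID))   # False
# After chown to the Pod's UID, the owner matches.
print(can_use_noatime(POD_UID, POD_UID))         # True
```

Group membership never appears in the predicate, which is why the ug+rwx permissions are enough for ordinary reads and writes but not for an open() with O_NOATIME.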
In my case, the file is owned by nfsnobody, not by the current UID, hence the error. This explains why the problem goes away only after running chown $UID:nfsnobody.
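To check this from inside the Pod without starting mongod, a probe along the following lines may help. It is a minimal sketch: the helper names try_open and probe_noatime are made up for illustration, and it assumes a Python interpreter is available in the container, which may not be true for every MongoDB image.

```python
import errno
import os

# O_NOATIME is Linux-only; fall back to 0 so the sketch still loads elsewhere.
O_NOATIME = getattr(os, "O_NOATIME", 0)


def try_open(path: str, extra_flags: int = 0) -> str:
    """Open path read-only with extra flags; return 'ok' or the errno name."""
    try:
        fd = os.open(path, os.O_RDONLY | extra_flags)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return "ok"


def probe_noatime(path: str) -> dict:
    """Compare a plain open() with an O_NOATIME open() on the same file."""
    return {
        "plain": try_open(path),
        "noatime": try_open(path, O_NOATIME),
        "file_owner_uid": os.stat(path).st_uid,
        "effective_uid": os.geteuid(),
    }
```

Calling probe_noatime("/var/lib/mongodb/data/WiredTiger.wt") in the debug terminal would be expected to report the plain open as fine and the O_NOATIME open as EPERM whenever file_owner_uid differs from effective_uid, which matches the difference between cat and mongod seen above.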
Some further details
The error occurs when posix/os_fs.c tries to open the file. On line 693, the O_NOATIME flag is set if __posix_open_file is called with WT_FS_OPEN_FILE_TYPE_DATA.
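The flag selection described here can be paraphrased roughly as follows. This is a Python illustration of the behaviour as I understand it, not a translation of the actual C code: the FileType enum and build_open_flags are hypothetical stand-ins for WiredTiger's WT_FS_OPEN_FILE_TYPE_* values and the logic inside __posix_open_file.

```python
import os
from enum import Enum, auto


class FileType(Enum):
    # Hypothetical stand-ins for WiredTiger's WT_FS_OPEN_FILE_TYPE_* values.
    DATA = auto()
    LOG = auto()
    REGULAR = auto()


# O_NOATIME is Linux-only; fall back to 0 so the sketch still loads elsewhere.
O_NOATIME = getattr(os, "O_NOATIME", 0)


def build_open_flags(file_type: FileType, readonly: bool = False) -> int:
    """Pick open() flags; only data files get O_NOATIME added."""
    flags = os.O_RDONLY if readonly else os.O_RDWR
    if file_type is FileType.DATA:
        # Skipping access-time updates saves an inode write per read,
        # but the kernel only allows it for the file's owner.
        flags |= O_NOATIME
    return flags
```

In this sketch the flag is added unconditionally for data files, so a file owned by another UID fails at open() time with EPERM instead of being opened without the optimisation.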