numpy import SystemError and segmentation fault only through LSF (OpenBLAS blas_thread_init: pthread_c



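When I import numpy with blas 1.1 in a script run through LSF, it fails with a SystemError and a segmentation fault after repeated instances of "OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable" and "OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max". When I use the same environment on the same machine outside LSF, it succeeds. It also succeeds with blas 1.0 (inside or outside LSF).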

I run bsub as follows:

bsub -q short-serial -W 00:01 -R "rusage[mem=1000]" -M 1000 -cwd $HOME -oo ~/test.lsf.out -eo ~/test.lsf.err -J test $HOME/test.sh

test.sh is a wrapper that runs test2.sh in a clean environment, so that conditions are the same whether I run inside or outside LSF:

$ cat test.sh
#!/bin/sh
env -i ~/test2.sh --noprofile --norc

In test2.sh, I print some environment information, set up the environment, and run Python to try import numpy:

$ cat test2.sh
export
ulimit -a
ldconfig -v
export PATH=
.  /group_workspaces/cems2/fiduceo/Users/gholl/anaconda3/etc/profile.d/conda.sh
conda activate
conda activate FCDR
python -c "import numpy; print('Success 1')"
python ~/mwe.py

Running this through LSF produces the following standard output:

$ cat test.lsf.out
Sender: LSF System <lsfadmin@host334.jc.rl.ac.uk>
Subject: Job 2475445: <test> in cluster <lotus> Exited
Job <test> was submitted from host <host293.jc.rl.ac.uk> by user <gholl> in cluster <lotus> at Wed Jul  4 18:44:42 2018.
Job was executed on host(s) <host334.jc.rl.ac.uk>, in queue <short-serial>, as user <gholl> in cluster <lotus> at Wed Jul  4 18:44:42 2018.
</home/users/gholl> was used as the home directory.
</home/users/gholl> was used as the working directory.
Started at Wed Jul  4 18:44:42 2018.
Terminated at Wed Jul  4 18:44:44 2018.
Results reported at Wed Jul  4 18:44:44 2018.
Your job looked like:
------------------------------------------------------------
# LSBATCH: User input
/home/users/gholl/test.sh
------------------------------------------------------------
Exited with exit code 139.
Resource usage summary:
CPU time :                                   0.60 sec.
Max Memory :                                 -
Average Memory :                             -
Total Requested Memory :                     1000.00 MB
Delta Memory :                               -
Max Swap :                                   -
Max Processes :                              -
Max Threads :                                -
Run time :                                   2 sec.
Turnaround time :                            2 sec.
The output (if any) follows:
export OLDPWD
export PWD="/home/users/gholl"
export SHLVL="1"
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1032189
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8589930496
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1032189
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

(ldconfig -v output omitted for brevity)

PS:
Read file </home/users/gholl/test.lsf.err> for stderr output of this job.

And for stderr:

$ cat test.lsf.err
/sbin/ldconfig: /etc/ld.so.conf.d/kernel-2.6.32-696.23.1.el6.x86_64.conf:6: duplicate hwcap 1 nosegneg
/sbin/ldconfig: /etc/ld.so.conf.d/kernel-2.6.32-754.el6.x86_64.conf:6: duplicate hwcap 1 nosegneg
/sbin/ldconfig: /opt/platform_mpi/lib/linux_amd64/libhpmpi.so is not an ELF file - it has the wrong magic bytes at the start.
/sbin/ldconfig: Can't create temporary cache file /etc/ld.so.cache~: Permission denied
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/group_workspaces/cems2/fiduceo/Users/gholl/anaconda3/envs/FCDR/lib/python3.6/site-packages/numpy/__init__.py", line 142, in <module>
from . import add_newdocs
File "/group_workspaces/cems2/fiduceo/Users/gholl/anaconda3/envs/FCDR/lib/python3.6/site-packages/numpy/add_newdocs.py", line 13, in <module>
from numpy.lib import add_newdoc
File "/group_workspaces/cems2/fiduceo/Users/gholl/anaconda3/envs/FCDR/lib/python3.6/site-packages/numpy/lib/__init__.py", line 8, in <module>
from .type_check import *
File "/group_workspaces/cems2/fiduceo/Users/gholl/anaconda3/envs/FCDR/lib/python3.6/site-packages/numpy/lib/type_check.py", line 11, in <module>
import numpy.core.numeric as _nx
File "/group_workspaces/cems2/fiduceo/Users/gholl/anaconda3/envs/FCDR/lib/python3.6/site-packages/numpy/core/__init__.py", line 16, in <module>
from . import multiarray
SystemError: initialization of multiarray raised unreported exception
/home/users/gholl/test2.sh: line 8: 52786 Segmentation fault      (core dumped) python -c "import numpy; print('Success 1')"
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 1032189 current, 1032189 max
Traceback (most recent call last):
File "/home/users/gholl/mwe.py", line 2, in <module>
import numpy
File "/group_workspaces/cems2/fiduceo/Users/gholl/anaconda3/envs/FCDR/lib/python3.6/site-packages/numpy/__init__.py", line 142, in <module>
from . import add_newdocs
File "/group_workspaces/cems2/fiduceo/Users/gholl/anaconda3/envs/FCDR/lib/python3.6/site-packages/numpy/add_newdocs.py", line 13, in <module>
from numpy.lib import add_newdoc
File "/group_workspaces/cems2/fiduceo/Users/gholl/anaconda3/envs/FCDR/lib/python3.6/site-packages/numpy/lib/__init__.py", line 8, in <module>
from .type_check import *
File "/group_workspaces/cems2/fiduceo/Users/gholl/anaconda3/envs/FCDR/lib/python3.6/site-packages/numpy/lib/type_check.py", line 11, in <module>
import numpy.core.numeric as _nx
File "/group_workspaces/cems2/fiduceo/Users/gholl/anaconda3/envs/FCDR/lib/python3.6/site-packages/numpy/core/__init__.py", line 16, in <module>
from . import multiarray
SystemError: initialization of multiarray raised unreported exception
/home/users/gholl/test2.sh: line 9: 52789 Segmentation fault      (core dumped) python ~/mwe.py

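I also looked at the output of ldconfig -v, but I don't know what to look for, and it is too long to include here. I did, however, compare the sorted output with and without LSF: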

$ diff -u <(sort test.lsf.out_core) <(sort test.nolsf.out_core)
--- /dev/fd/63  2018-07-04 18:56:05.405440986 +0100
+++ /dev/fd/62  2018-07-04 18:56:05.405440986 +0100
@@ -1,4 +1,4 @@
-core file size          (blocks, -c) unlimited
+core file size          (blocks, -c) 0
cpu time               (seconds, -t) unlimited
data seg size           (kbytes, -d) unlimited
export OLDPWD
@@ -102,9 +102,12 @@
libBrokenLocale.so.1 -> libBrokenLocale-2.12.so
libBrokenLocale.so.1 -> libBrokenLocale-2.12.so
libbtf.so.1 -> libbtf.so.1.1.0
+       libbtparser.so.2 -> libbtparser.so.2.2.2
libbz2.so.1 -> libbz2.so.1.0.4
libcairo.so.2 -> libcairo.so.2.10800.8
libcamd.so.2 -> libcamd.so.2.2.0
+       libcanberra-gtk.so.0 -> libcanberra-gtk.so.0.1.5
+       libcanberra.so.0 -> libcanberra.so.0.2.1
libcanna16.so.1 -> libcanna16.so.1.2.0
libcanna.so.1 -> libcanna.so.1.2.0
libcap-ng.so.0 -> libcap-ng.so.0.0.0
@@ -116,7 +119,7 @@
libcdt.so.5 -> libcdt.so.5.0.0
libcfitsio.so.0 -> libcfitsio.so.0
libcgraph.so.6 -> libcgraph.so.6.0.0
-       libcgroup.so.1 -> libcgroup.so.1.0.40
+       libCharLS.so.1 -> libCharLS.so.1.0
libcholmod.so.1 -> libcholmod.so.1.7.1
libcidn.so.1 -> libcidn-2.12.so
libcidn.so.1 -> libcidn-2.12.so
@@ -188,6 +191,7 @@
libeggdbus-1.so.0 -> libeggdbus-1.so.0.0.0
libEGL.so.1 -> libEGL.so.1.0.0
libelf.so.1 -> libelf-0.164.so
+       libenchant.so.1 -> libenchant.so.1.5.0
libepoxy.so.0 -> libepoxy.so.0.0.0
libesoobS.so.2 -> libesoobS.so.2.0.0
libevent-1.4.so.2 -> libevent-1.4.so.2.1.3
@@ -263,6 +267,7 @@
libgmodule-2.0.so.0 -> libgmodule-2.0.so.0.2800.8
libgmp.so.3 -> libgmp.so.3.5.0
libgmpxx.so.4 -> libgmpxx.so.4.1.0
+       libgnomecanvas-2.so.0 -> libgnomecanvas-2.so.0.2600.0
libgnutls-extra.so.26 -> libgnutls-extra.so.26.22.6
libgnutls.so.26 -> libgnutls.so.26.22.6
libgnutlsxx.so.26 -> libgnutlsxx.so.26.14.12
@@ -357,6 +362,7 @@
libgstvideo-0.10.so.0 -> libgstvideo-0.10.so.0.20.0
libgta.so.0 -> libgta.so.0.0.1
libgthread-2.0.so.0 -> libgthread-2.0.so.0.2800.8
+       libgtksourceview-2.0.so.0 -> libgtksourceview-2.0.so.0.0.0
libgtk-x11-2.0.so.0 -> libgtk-x11-2.0.so.0.2400.23
libgtrtst.so.2 -> libgtrtst.so.2.0.0
libgudev-1.0.so.0 -> libgudev-1.0.so.0.0.1
@@ -378,9 +384,9 @@
libhwloc.so.4 -> libhwloc.so
/lib/i686: (hwcap: 0x0008000000000000)
/lib/i686/nosegneg: (hwcap: 0x0028000000000000)
-       libibmad.so.5 -> libibmad.so.5.4.0
+       libibmad.so.5 -> libibmad.so.5.5.0
libibnetdisc.so.5 -> libibnetdisc.so.5.3.0
-       libibumad.so.3 -> libibumad.so.3.0.4
+       libibumad.so.3 -> libibumad.so.3.1.0
libibverbs.so.1 -> libibverbs.so.1.0.0
libICE.so.6 -> libICE.so.6.3.0
libicudata.so.42 -> libicudata.so.42.1
@@ -499,6 +505,7 @@
libnih.so.1 -> libnih.so.1.0.0
libnl.so.1 -> libnl.so.1.1.4
libnn.so.2 -> libnn.so.2.0.0
+       libnotify.so.1 -> libnotify.so.1.2.3
libnsl.so.1 -> libnsl-2.12.so
libnsl.so.1 -> libnsl-2.12.so
libnspr4.so -> libnspr4.so
@@ -542,17 +549,17 @@
libopcodes-2.20.51.0.2-5.48.el6.so -> libopcodes-2.20.51.0.2-5.48.el6.so
libopenjp2.so.7 -> libopenjp2.so.2.3.0
libopenjpeg.so.2 -> libopenjpeg.so.2.1.3.0
-       libopensm.so.5 -> libopensm.so.5.2.0
+       libopensm.so.12 -> libopensm.so.12.0.0
liboplodbcS.so.2 -> liboplodbcS.so.2.0.0
liboraodbcS.so.2 -> liboraodbcS.so.2.0.0
libORBit-2.so.0 -> libORBit-2.so.0.1.0
libORBitCosNaming-2.so.0 -> libORBitCosNaming-2.so.0.1.0
libORBit-imodule-2.so.0 -> libORBit-imodule-2.so.0.0.0
-       libosmcomp.so.3 -> libosmcomp.so.3.0.8
+       libosmcomp.so.3 -> libosmcomp.so.3.0.6
libOSMesa16.so.6 -> libOSMesa16.so.6.5.3
libOSMesa32.so.6 -> libOSMesa32.so.6.5.3
libOSMesa.so.6 -> libOSMesa.so.6.5.3
-       libosmvendor.so.3 -> libosmvendor.so.3.0.9
+       libosmvendor.so.3 -> libosmvendor.so.3.0.8
libossp-uuid.so.16 -> libossp-uuid.so.16.0.21
libotf.so.0 -> libotf.so.0.0.0
libp11-kit.so.0 -> libp11-kit.so.0.0.0
@@ -676,7 +683,6 @@
libsensors.so.4 -> libsensors.so.4.2.0
libsepol.so.1 -> libsepol.so.1
libserf-1.so.1 -> libserf-1.so.1.3.0
-       libsgutils2.so.2 -> libsgutils2.so.2.0.0
libshiboken-python2.7.so.1.2 -> libshiboken-python2.7.so.1.2.1
libshp.so.1 -> libshp.so.1.0.1
libslang.so.2 -> libslang.so.2.2.1
@@ -686,6 +692,7 @@
libsndfile.so.1 -> libsndfile.so.1.0.20
libsnmp.so.20 -> libsnmp.so.20.0.0
libsoftokn3.so -> libsoftokn3.so
+       libspatialite.so.2 -> libspatialite.so.2.0.4
libspqr.so.1 -> libspqr.so.1.1.2
libsqlite3.so.0 -> libsqlite3.so.0.8.6
libssh2.so.1 -> libssh2.so.1.0.1
@@ -754,6 +761,7 @@
libvorbisenc.so.2 -> libvorbisenc.so.2.0.6
libvorbisfile.so.3 -> libvorbisfile.so.3.3.2
libvorbis.so.0 -> libvorbis.so.0.4.3
+       libvpx.so.1 -> libvpx.so.1.3.0
libvte.so.9 -> libvte.so.9.2501.0
libwbclient.so.0 -> libwbclient.so.0
libwebpdecoder.so.1 -> libwebpdecoder.so.1.0.3
@@ -764,6 +772,7 @@
libwlm-nosched.so -> libwlm-nosched.so
libwmf-0.2.so.7 -> libwmf-0.2.so.7.1.0
libwmflite-0.2.so.7 -> libwmflite-0.2.so.7.0.1
+       libwnck-1.so.22 -> libwnck-1.so.22.3.23
libwrap.so.0 -> libwrap.so.0.7.6
libwx_baseu-2.8.so.0 -> libwx_baseu-2.8.so.0.8.0
libwx_baseu-3.0.so.0 -> libwx_baseu-3.0.so.0.2.0
@@ -800,6 +809,7 @@
libX11.so.6 -> libX11.so.6.3.0
libX11-xcb.so.1 -> libX11-xcb.so.1.0.0
libX11-xcb.so.1 -> libX11-xcb.so.1.0.0
+       libx86.so.1 -> libx86.so.1
libXau.so.6 -> libXau.so.6.0.0
libXau.so.6 -> libXau.so.6.0.0
libXaw3d.so.7 -> libXaw3d.so.7.0
@@ -842,6 +852,7 @@
libxcb.so.1 -> libxcb.so.1.1.0
libxcb-sync.so.1 -> libxcb-sync.so.1.0.0
libxcb-sync.so.1 -> libxcb-sync.so.1.0.0
+       libxcb-util.so.1 -> libxcb-util.so.1.0.0
libxcb-xevie.so.0 -> libxcb-xevie.so.0.0.0
libxcb-xevie.so.0 -> libxcb-xevie.so.0.0.0
libxcb-xf86dri.so.0 -> libxcb-xf86dri.so.0.0.0
@@ -889,6 +900,7 @@
libXp.so.6 -> libXp.so.6.2.0
libXrandr.so.2 -> libXrandr.so.2.2.0
libXrender.so.1 -> libXrender.so.1.3.0
+       libXRes.so.1 -> libXRes.so.1.0.0
libxslt.so.1 -> libxslt.so.1.1.26
libxtables.so.4 -> libxtables.so.4.0.0-1.4.7
libXt.so.6 -> libXt.so.6.0.0
@@ -897,20 +909,21 @@
libXxf86dga.so.1 -> libXxf86dga.so.1.0.0
libXxf86misc.so.1 -> libXxf86misc.so.1.1.0
libXxf86vm.so.1 -> libXxf86vm.so.1.0.0
-       libyaml-0.so.2 -> libyaml-0.so.2.0.4
libz.so.1 -> libz.so.1.2.3
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
-max user processes              (-u) 1032189
-open files                      (-n) 4096
+max user processes              (-u) 1024
+open files                      (-n) 48000
/opt/platform_mpi/lib/linux_amd64:
p11-kit-trust.so -> libnssckbi.so
-pending signals                 (-i) 1032189
+pending signals                 (-i) 515955
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
scheduling priority             (-e) 0
-stack size              (kbytes, -s) 8589930496
+stack size              (kbytes, -s) 2097151
+Success
+Success 1
/usr/lib:
/usr/lib64:
/usr/lib64/atlas:
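Because the sorted diff mixes the ulimit lines with hundreds of ldconfig entries, it can help to extract only the limits. Below is a minimal sketch, assuming the two captured outputs contain `ulimit -a` lines in the bash format shown above; the names parse_ulimit and diff_limits are hypothetical helpers, not part of any tool used in this post.

```python
import re

# Matches bash `ulimit -a` lines such as
# "max user processes              (-u) 1032189"
LIMIT_LINE = re.compile(
    r"^(?P<name>.+?)\s+\((?P<unit>[^()]*-\w)\)\s+(?P<value>\S+)$"
)


def parse_ulimit(text):
    """Return {limit name: value} for every `ulimit -a` line found in text."""
    limits = {}
    for line in text.splitlines():
        match = LIMIT_LINE.match(line.strip())
        if match:
            limits[match.group("name").strip()] = match.group("value")
    return limits


def diff_limits(first, second):
    """Return {name: (first value, second value)} for limits that differ."""
    names = sorted(set(first) | set(second))
    return {
        name: (first.get(name), second.get(name))
        for name in names
        if first.get(name) != second.get(name)
    }


if __name__ == "__main__":
    # Two lines copied from the outputs in this post, for illustration only.
    inside = "max user processes              (-u) 1032189\n"
    outside = "max user processes              (-u) 1024\n"
    for name, (a, b) in diff_limits(parse_ulimit(inside), parse_ulimit(outside)).items():
        print(f"{name}: LSF={a} non-LSF={b}")
```

In practice the two strings would be read from the captured files (test.lsf.out_core and test.nolsf.out_core above); lines that are not limits, such as the ldconfig entries, are ignored by the pattern.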

When I run outside LSF, the standard output is

$ env -i ~/test2.sh --noprofile --norc
export OLDPWD
export PWD="/home/users/gholl"
export SHLVL="1"
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 515955
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 48000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 2097151
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

(ldconfig -v output omitted for brevity)
Success 1
Success

And the stderr output is limited to the same ldconfig errors as before:

/sbin/ldconfig: /etc/ld.so.conf.d/kernel-2.6.32-696.23.1.el6.x86_64.conf:6: duplicate hwcap 1 nosegneg
/sbin/ldconfig: /etc/ld.so.conf.d/kernel-2.6.32-754.el6.x86_64.conf:6: duplicate hwcap 1 nosegneg
/sbin/ldconfig: /opt/platform_mpi/lib/linux_amd64/libhpmpi.so is not an ELF file - it has the wrong magic bytes at the start.
/sbin/ldconfig: Can't create temporary cache file /etc/ld.so.cache~: Permission denied

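I'm running Python 3.6.3 in a conda environment drawn mostly from the anaconda and conda-forge channels. I have noticed before that when I set a tight ulimit -v, import numpy fails with the same "OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable". At present, however, no ulimit -v is set. The only ulimit differences between the LSF and non-LSF cases are that, for several properties, the limits inside LSF are much larger than outside, so I don't see how ulimit could cause the failure in LSF here. In my earlier case, I also managed to reproduce the problem outside LSF.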
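To see the limits exactly as the Python process sees them (rather than as the shell reports them), the standard-library resource module can be queried directly. This is a minimal sketch for Linux; the LIMITS table, fmt, and current_limits are hypothetical names chosen for illustration. Note that resource reports the stack and address-space limits in bytes, whereas ulimit -s and ulimit -v print kbytes.

```python
import resource

# Limits that plausibly matter for thread creation; the labels mirror `ulimit -a`.
LIMITS = {
    "max user processes (-u)": resource.RLIMIT_NPROC,
    "stack size (-s)": resource.RLIMIT_STACK,
    "virtual memory (-v)": resource.RLIMIT_AS,
}


def fmt(value):
    """Render a raw limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {label: (soft, hard)} for this process, values as strings."""
    result = {}
    for label, res in LIMITS.items():
        soft, hard = resource.getrlimit(res)
        result[label] = (fmt(soft), fmt(hard))
    return result


if __name__ == "__main__":
    for label, (soft, hard) in current_limits().items():
        print(f"{label}: soft={soft} hard={hard}")
```

One thing that may be worth checking with this, stated as a hypothesis rather than a diagnosis: the Linux pthread_create(3) manual page says that under NPTL, a soft RLIMIT_STACK other than "unlimited" at program start determines the default stack size of new threads, and the stack limit inside LSF is far larger than outside.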

Why can I import numpy directly, but get a SystemError and a segmentation fault when running through LSF? What can I look at to debug this further?
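One way to narrow this down without involving numpy or OpenBLAS at all is to check whether the job can create plain threads. The sketch below is an assumption about a useful diagnostic, not something from the original runs: probe_threads is a hypothetical helper that tries to start a number of idle threads and reports how many the OS allowed. CPython raises RuntimeError when the underlying thread creation fails.

```python
import os
import threading


def probe_threads(count):
    """Try to start `count` idle threads; return how many actually started."""
    release = threading.Event()
    started = []
    try:
        for _ in range(count):
            thread = threading.Thread(target=release.wait)
            thread.start()  # raises RuntimeError if no new thread can be created
            started.append(thread)
    except RuntimeError:
        pass  # stop at the first refusal and report what we got
    finally:
        release.set()  # let every started thread finish
        for thread in started:
            thread.join()
    return len(started)


if __name__ == "__main__":
    wanted = os.cpu_count() or 1
    print(f"started {probe_threads(wanted)} of {wanted} threads")
```

Running this inside and outside LSF would show whether thread creation itself is restricted in the job. If it starts fewer threads than requested inside LSF, the problem is likely in the job's resource settings rather than in numpy.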

I don't have LSF administrator access.

I don't know why, but everything works when I set

export OMP_NUM_THREADS=1
export USE_SIMPLE_THREADED_LEVEL3=1
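If editing the job script is inconvenient, the thread count can also be set from Python, as long as it happens before numpy is first imported. A minimal sketch follows; limit_blas_threads is a hypothetical helper, OMP_NUM_THREADS is the variable from the exports above, and OPENBLAS_NUM_THREADS is an addition of mine (OpenBLAS's own runtime variable), not something tested in this post.

```python
import os


def limit_blas_threads(count=1):
    """Set thread-count variables; call this before numpy is first imported."""
    settings = {
        "OMP_NUM_THREADS": str(count),
        # Assumption: also set OpenBLAS's own variable for good measure.
        "OPENBLAS_NUM_THREADS": str(count),
    }
    os.environ.update(settings)
    return settings


if __name__ == "__main__":
    applied = limit_blas_threads(1)
    print(applied)
    # `import numpy` belongs after this point, so that the BLAS library
    # reads the variables when it initializes its threads.
```

Setting the variables after numpy has already been imported would likely have no effect on the failure described here, since the errors are printed by blas_thread_init at import time.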
