Error when using Theano shared variables in a Celery worker



I have a Celery task named simple_theano_tasks:

@app.task(bind=True, queue='test')
def simple_theano_tasks(self):
  import theano, numpy as np
  my_array = np.zeros((0,), dtype=theano.config.floatX)
  shared = theano.shared(my_array, name='my_variable', borrow=True)
  print 'Done. Shared value is {}'.format(shared.get_value())

When Theano is configured to use the CPU, everything works as expected (no error):

$ THEANO_FLAGS=device=cpu celery -A my_project worker -c1 -l info -Q test
[INFO/MainProcess] Received task: my_project.tasks.simple_theano_tasks[xxxx]
[WARNING/Worker-1] Done. Shared value is []
[INFO/MainProcess] Task my_project.tasks.simple_theano_tasks[xxxx] succeeded in 0.00407959899985s

Now, when I do the same thing with the GPU enabled, Theano (or CUDA) raises an error:

$ THEANO_FLAGS=device=gpu celery -A my_project worker -c1 -l info -Q test
 ...
 Using gpu device 0: GeForce GTX 670M (CNMeM is enabled)
 ...
[INFO/MainProcess] Received task: my_project.tasks.simple_theano_tasks[xxx]
[ERROR/MainProcess] Task my_project.tasks.simple_theano_tasks[xxx] raised unexpected: RuntimeError("Cuda error 'initialization error' while copying %lli data element to device memory",)
Traceback (most recent call last):
  File "/.../local/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/.../local/lib/python2.7/site-packages/celery/app/trace.py", line 438, in __protected_call__
    return self.run(*args, **kwargs)
  File "/.../my_project/tasks.py", line 362, in simple_theano_tasks
    shared = theano.shared(my_array, name='my_variable', borrow=True)
  File "/.../local/lib/python2.7/site-packages/theano/compile/sharedvalue.py", line 247, in shared
    allow_downcast=allow_downcast, **kwargs)
  File "/.../local/lib/python2.7/site-packages/theano/sandbox/cuda/var.py", line 229, in float32_shared_constructor
    deviceval = type_support_filter(value, type.broadcastable, False, None)
RuntimeError: Cuda error 'initialization error' while copying %lli data element to device memory

Finally, when I run exactly the same code in a Python shell, I get no error:

$ THEANO_FLAGS=device=gpu python
Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import theano, numpy as np
Using gpu device 0: GeForce GTX 670M (CNMeM is enabled)
>>> my_array = np.zeros((0,), dtype=theano.config.floatX)
>>> shared = theano.shared(my_array, name='my_variable', borrow=True)
>>> print 'Done. Shared value is {}'.format(shared.get_value())
Done. Shared value is []

Does anyone know:

  • Why does Theano behave differently in a Celery worker?
  • How can I fix this?

Some additional context:

  • I am using theano@0.7.0 and Celery@3.1.18

  • My "~/.theanorc" file:

    [global]
    floatX=float32
    device=gpu
    [mode]=FAST_RUN
    [nvcc]
    fastmath=True
    [lib]
    cnmem=0.1
    [cuda]
    root=/usr/local/cuda
    

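One workaround is:

  1. Specify the CPU as the target device (for example with "THEANO_FLAGS=device=cpu")
  2. Afterwards, inside the task, override the device by explicitly selecting the GPU

The Celery task now looks like this: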

@app.task(bind=True, queue='test')
def simple_theano_tasks(self):
  # At this point, no theano import statements have been processed, and so the device is unbound
  import theano, numpy as np
  import theano.sandbox.cuda
  theano.sandbox.cuda.use('gpu') # enable gpu
  my_array = np.zeros((0,), dtype=theano.config.floatX)
  shared = theano.shared(my_array, name='my_variable', borrow=True)
  print 'Done. Shared value is {}'.format(shared.get_value())
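
If several tasks need the GPU, the override could be moved into a small helper so that it runs only once per worker process. The following is a minimal sketch, not part of the original workaround: `ensure_gpu`, `_bind_theano_gpu` and the `init` parameter are hypothetical names, and the default branch assumes the old `theano.sandbox.cuda` backend of Theano 0.7 with the worker started under `device=cpu`.

```python
import os

# PID of the process in which the GPU was last bound (None = never bound).
_gpu_bound_pid = None


def _bind_theano_gpu():
    # Assumption: Theano 0.7 with the old CUDA backend, as in the task above.
    import theano.sandbox.cuda
    theano.sandbox.cuda.use('gpu')


def ensure_gpu(init=_bind_theano_gpu):
    """Run `init` at most once per process; return True if it ran."""
    global _gpu_bound_pid
    pid = os.getpid()
    if _gpu_bound_pid == pid:
        return False  # already bound in this process
    init()
    _gpu_bound_pid = pid
    return True
```

A task would then call `ensure_gpu()` as its first statement, before creating any shared variable. Because the check is keyed on the process ID, a forked child does not trust a flag inherited from its parent. The same helper could presumably be called from a handler connected to Celery's `worker_process_init` signal, although I have not tried that with this setup.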

Note: I found this solution while reading an article about using multiple GPUs.
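
The post does not say why the worker behaves differently from the shell. A likely cause, which the post does not confirm, is that Celery's default prefork pool creates the worker with fork() after the main process has already initialized the GPU (the "Using gpu device 0" line is printed before any task is received), and CUDA state that was initialized before a fork is generally not usable in the child. The sketch below only illustrates that inheritance mechanism in plain Python on a POSIX system. The `state` dictionary is a hypothetical stand-in for a CUDA context, and no GPU is involved.

```python
import os

# Stand-in for a CUDA context: remembers which process created it.
state = {"created_in_pid": os.getpid()}


def child_holds_inherited_state():
    """Fork and report whether the child sees state created by the parent."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the dictionary was copied by fork(), but our PID is new.
        os.close(read_fd)
        inherited = state["created_in_pid"] != os.getpid()
        os.write(write_fd, b"1" if inherited else b"0")
        os._exit(0)  # leave immediately, skipping parent cleanup handlers
    # Parent: read the child's one-byte answer, then reap the child.
    os.close(write_fd)
    answer = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return answer == b"1"
```

Under this reading, starting the worker with `device=cpu` keeps the parent from touching the GPU, so the later call to `theano.sandbox.cuda.use('gpu')` is the first CUDA initialization in the child process.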
