Getting "ImportError: No module named" when using Parallel Python with methods in a package



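I'm trying to use Parallel Python (pp) for some distributed benchmarking (essentially coordinating and running some code on a set of machines from a central server). My code worked perfectly well until I moved the functionality into a separate package. Since then I keep getting ImportError: No module named some.module.pp_test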

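My question is really twofold: has anyone run into this problem with pp, and if so, how do you solve it? I tried using dill (import dill), but it didn't help. Also, is there a good alternative to Parallel Python that doesn't require any additional infrastructure?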

The exact error I get is:

RUNNING TEST
Waiting for hosts to finish booting....A fatal error has occured during the function execution
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/ppworker.py", line 86, in run
    __args = pickle.loads(__sargs)
ImportError: No module named some.module.pp_test
Caught exception in the run phase 'NoneType' object is not iterable
Traceback (most recent call last):
  File "test.py", line 5, in <module>
    p.ping_pong()
  File "/home/ubuntu/workspace/pp-test/some/module/pp_test.py", line 5, in ping_pong
    a_test.run()
  File "/home/ubuntu/workspace/pp-test/some/module/pp_test.py", line 27, in run
    pong, hostname = ping()
TypeError: 'NoneType' object is not iterable

The code is structured like this:

pp-test/
       test.py
       some/
            __init__.py
            module/
                   __init__.py
                   pp_test.py

test.py is implemented as:

from some.module.pp_test import MWE
p = MWE()
p.ping_pong()

pp_test.py is:

class MWE():
  def ping_pong(self):
    print "RUNNING TEST "
    a_test = PPTester()
    a_test.run()
import pp
import time
from sys import stdout, exit
class PPTester(object):
  def run(self):
    try:
        ppservers = ('10.10.10.10', )
        time.sleep(5)
        job_server = pp.Server(0, ppservers=ppservers)
        stdout.write("Waiting for hosts to finish booting...")
        while len(job_server.get_active_nodes()) - 1 < len(ppservers):
            stdout.write(".")
            stdout.flush()
            time.sleep(1)
        ppmodules = ()
        pings = [(server, job_server.submit(self.run_pong, modules=ppmodules)) for server in ppservers]
        for server, ping in pings:
            pong, hostname = ping()
            print "Host ", hostname, " is alive!"
        print "All servers booted up, starting benchmarks..."
        job_server.print_stats()
    except Exception as e:
        print "Caught exception in the run phase", e
        raise
    pass
  def run_pong(self):
    import subprocess
    p = subprocess.Popen("hostname", stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
    (output, err) = p.communicate()
    p_status = p.wait()
    return "pong ", output

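dill won't work with pp out of the box, because pp doesn't serialize Python objects; pp extracts the source code of the object (much like the inspect module in the standard Python library).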
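To make the distinction concrete, here is a minimal stdlib-only sketch (written for Python 3, and not pp's actual code path). It contrasts pickling by reference, which is where the traceback in the question fails (pickle.loads on the worker), with shipping extracted source text. The module name pp_demo_mod and the temporary directory are hypothetical stand-ins for a packaged module such as some/module/pp_test.py; the "worker" is simulated by removing the module from the import path.

```python
import importlib
import inspect
import os
import pickle
import sys
import tempfile

# Hypothetical stand-in for a packaged module (e.g. some/module/pp_test.py).
MODULE_SOURCE = 'def run_pong():\n    return "pong"\n'

tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "pp_demo_mod.py"), "w") as fh:
    fh.write(MODULE_SOURCE)

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()
demo = importlib.import_module("pp_demo_mod")

# 1) pickle stores only a reference (module name + function name), no code.
payload = pickle.dumps(demo.run_pong)

# 2) inspect.getsource returns the function's text, which carries the code itself.
source_text = inspect.getsource(demo.run_pong)

# Simulate a worker process that cannot import the module.
sys.path.remove(tmp_dir)
del sys.modules["pp_demo_mod"]

try:
    pickle.loads(payload)
    unpickle_error = None
except ImportError as exc:  # ModuleNotFoundError on Python 3
    unpickle_error = exc

# The extracted source can still be executed in a fresh namespace.
worker_namespace = {}
exec(source_text, worker_namespace)
result = worker_namespace["run_pong"]()

print("unpickle failed:", unpickle_error)
print("source-based call returned:", result)
```

The reference-based payload is only usable where the module is importable, while the source text is self-contained. This is presumably why stronger source inspection matters when the code lives in a package that the remote machines do not have installed.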

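To get pp to work with dill (actually dill.source, which is inspect augmented by dill), you have to use a fork of pp called ppft. ppft installs as pp (i.e. it is imported with import pp), but it has much stronger source inspection, so you can automatically "serialize" most Python objects and have ppft track their dependencies automatically.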

Get ppft here: https://github.com/uqfoundation

ppft can also be installed with pip (pip install ppft), and it is compatible with Python 3.x.
