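I created a custom environment for reinforcement learning with tf-agents (knowing tf-agents is not needed to answer this question). If I instantiate a single thread by setting num_parallel_environments to 1, it works fine, but when I increase num_parallel_environments to 50, it throws infrequent and seemingly random errors, such as an IndexError inside random.shuffle(). The code is as follows: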
Inside train.py:
tf_env = tf_py_environment.TFPyEnvironment(
    batched_py_environment.BatchedPyEnvironment(
        [environment.CardGameEnv()] * num_parallel_environments))
In my environment, this runs in a thread:
self.cardStack = getFullDeck()
random.shuffle(self.cardStack)
This is a plain function, imported into each threaded class:
def getFullDeck():
    deck = []
    for rank in Ranks:
        for suit in Suits:
            deck.append(Card(rank, suit))
    return deck
This is one of the possible errors:
Traceback (most recent call last):
File "e:Userstmp.vscodeextensionsms-python.python-2019.1.0pythonFilesptvsd_launcher.py", line 45, in <module>
main(ptvsdArgs)
File "e:Userstmp.vscodeextensionsms-python.python-2019.1.0pythonFileslibpythonptvsd__main__.py", line 348, in main
run()
File "e:Userstmp.vscodeextensionsms-python.python-2019.1.0pythonFileslibpythonptvsd__main__.py", line 253, in run_file
runpy.run_path(target, run_name='__main__')
File "C:Python37librunpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:Python37librunpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:Python37librunpy.py", line 85, in _run_code
exec(code, run_globals)
File "e:UserstmpDocumentsProgrammingNeural NetsPoker_AItrain_v2.py", line 320, in <module>
app.run(main)
File "C:Python37libsite-packagesabslapp.py", line 300, in run
_run_main(main, args)
File "C:Python37libsite-packagesabslapp.py", line 251, in _run_main
sys.exit(main(argv))
File "e:UserstmpDocumentsProgrammingNeural NetsPoker_AItrain_v2.py", line 315, in main
num_eval_episodes=FLAGS.num_eval_episodes)
File "E:UserstmpAppDataRoamingPythonPython37site-packagesginconfig.py", line 1032, in wrapper
utils.augment_exception_message_and_reraise(e, err_str)
File "E:UserstmpAppDataRoamingPythonPython37site-packagesginutils.py", line 49, in augment_exception_message_and_reraise
six.raise_from(proxy.with_traceback(exception.__traceback__), None)
File "<string>", line 3, in raise_from
File "E:UserstmpAppDataRoamingPythonPython37site-packagesginconfig.py", line 1009, in wrapper
return fn(*new_args, **new_kwargs)
File "e:UserstmpDocumentsProgrammingNeural NetsPoker_AItrain_v2.py", line 251, in train_eval
collect_driver.run()
File "C:Python37libsite-packagestf_agentsdriversdynamic_episode_driver.py", line 149, in run
maximum_iterations=maximum_iterations)
File "C:Python37libsite-packagestf_agentsutilscommon.py", line 111, in with_check_resource_vars
return fn(*fn_args, **fn_kwargs)
File "C:Python37libsite-packagestf_agentsdriversdynamic_episode_driver.py", line 180, in _run
name='driver_loop'
File "C:Python37libsite-packagestensorflowpythonopscontrol_flow_ops.py", line 2457, in while_loop_v2
return_same_structure=True)
File "C:Python37libsite-packagestensorflowpythonopscontrol_flow_ops.py", line 2689, in while_loop
loop_vars = body(*loop_vars)
File "C:Python37libsite-packagestf_agentsdriversdynamic_episode_driver.py", line 103, in loop_body
next_time_step = self.env.step(action_step.action)
File "C:Python37libsite-packagestf_agentsenvironmentstf_environment.py", line 232, in step
return self._step(action)
File "C:Python37libsite-packagestensorflowpythonautographimplapi.py", line 232, in graph_wrapper
return func(*args, **kwargs)
File "C:Python37libsite-packagestf_agentsenvironmentstf_py_environment.py", line 218, in _step
_step_py, flat_actions, self._time_step_dtypes, name='step_py_func')
File "C:Python37libsite-packagestensorflowpythonopsscript_ops.py", line 488, in numpy_function
return py_func_common(func, inp, Tout, stateful=True, name=name)
File "C:Python37libsite-packagestensorflowpythonopsscript_ops.py", line 452, in py_func_common
result = func(*[x.numpy() for x in inp])
File "C:Python37libsite-packagestf_agentsenvironmentstf_py_environment.py", line 203, in _step_py
self._time_step = self._env.step(packed)
File "C:Python37libsite-packagestf_agentsenvironmentspy_environment.py", line 174, in step
self._current_time_step = self._step(action)
File "C:Python37libsite-packagestf_agentsenvironmentsbatched_py_environment.py", line 140, in _step
zip(self._envs, unstacked_actions))
File "C:Python37libmultiprocessingpool.py", line 268, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "C:Python37libmultiprocessingpool.py", line 657, in get
raise self._value
File "C:Python37libmultiprocessingpool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "C:Python37libmultiprocessingpool.py", line 44, in mapstar
return list(map(*args))
File "C:Python37libsite-packagestf_agentsenvironmentsbatched_py_environment.py", line 139, in <lambda>
lambda env_action: env_action[0].step(env_action[1]),
File "C:Python37libsite-packagestf_agentsenvironmentspy_environment.py", line 174, in step
self._current_time_step = self._step(action)
File "e:UserstmpDocumentsProgrammingNeural NetsPoker_AIenvironment.py", line 116, in _step
canRoundContinue = self._table.runUntilChoice(action)
File "e:UserstmpDocumentsProgrammingNeural NetsPoker_AItable.py", line 326, in runUntilChoice
random.shuffle(self.cardStack)
File "C:Python37librandom.py", line 278, in shuffle
x[i], x[j] = x[j], x[i]
IndexError: list index out of range
In call to configurable 'train_eval' (<function train_eval at 0x000002722713A158>)
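I suspect this error occurs because threads are modifying the array at the same time, but I don't understand why that would happen: everything takes place inside a class instance, and the array returned by getFullDeck() is recreated on every call, so there shouldn't be multiple threads with access to the same reference, right?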
tf_env = tf_py_environment.TFPyEnvironment(
    batched_py_environment.BatchedPyEnvironment(
        [environment.CardGameEnv()] * num_parallel_environments))
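You are reusing the same environment for every parallel instance instead of creating a new environment for each one. Multiplying a list repeats the reference to the single CardGameEnv object, so all worker threads step the same instance and mutate the same cardStack. You might want to try something like: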
tf_env = tf_py_environment.TFPyEnvironment(
    batched_py_environment.BatchedPyEnvironment(
        [environment.CardGameEnv() for _ in range(num_parallel_environments)]))
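To see the difference without tf-agents, here is a minimal sketch. FakeEnv and card_stack are hypothetical stand-ins for CardGameEnv and its cardStack, not the real classes. It only shows that list multiplication yields one shared object while a comprehension yields independent ones:

```python
class FakeEnv:
    """Hypothetical stand-in for CardGameEnv with per-instance mutable state."""

    def __init__(self):
        self.card_stack = []


num_parallel_environments = 4

# List multiplication repeats the reference; the constructor runs only once.
shared = [FakeEnv()] * num_parallel_environments

# A comprehension runs the constructor once per element.
separate = [FakeEnv() for _ in range(num_parallel_environments)]

# Count how many distinct objects each list really holds.
distinct_shared = len({id(env) for env in shared})
distinct_separate = len({id(env) for env in separate})

# A change made through one slot of `shared` shows up in every other slot,
# which is what lets several threads corrupt the same deck.
shared[0].card_stack.append("ace")
leaked = shared[3].card_stack

# The same change in `separate` stays local to that one instance.
separate[0].card_stack.append("ace")
isolated = separate[3].card_stack

print(distinct_shared, distinct_separate, leaked, isolated)
```

With the shared list, one thread can be in the middle of random.shuffle() while another thread reassigns or shrinks the same list, which would be consistent with the intermittent IndexError in the traceback.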