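How to get reproducible randomness with OpenAI Gym and SCOOP?
I want to get exactly the same results every time I repeat this example. If possible, I would like the solution to work with existing libraries that draw on randomness providers such as random and np.random. That may be a problem, because such libraries usually rely on the global random state and do not offer an interface for a local random state.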
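For reference, random and NumPy themselves do offer local generators (random.Random and np.random.default_rng), although that only helps when the code consuming the randomness accepts a generator object. The following is a minimal sketch of that idea; the function noisy_rollout is a hypothetical name used for illustration and is not part of Gym or SCOOP.

```python
import random
import numpy as np


def noisy_rollout(seed, steps=5):
    """Hypothetical helper: draws numbers from generators local to this call."""
    py_rng = random.Random(seed)          # independent of the global random state
    np_rng = np.random.default_rng(seed)  # independent of the global np.random state
    return [(py_rng.random(), float(np_rng.standard_normal())) for _ in range(steps)]


first = noisy_rollout(7)

# Disturb the global state between the two calls.
random.seed(123)
np.random.seed(123)
random.random()
np.random.rand()

second = noisy_rollout(7)
print(first == second)
```

Because each call builds its own generators, changes to the global state in between have no effect on the result. A library that calls random.random() or np.random.rand() directly would still depend on the global state.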
My example script looks like this:
import random
import numpy as np
from scoop import futures
import gym


def do(it):
    # Seed every source of randomness with the task index.
    random.seed(it)
    np.random.seed(it)
    env.seed(it)
    env.action_space.seed(it)
    env.reset()
    observations = []
    for i in range(3):
        while True:
            action = env.action_space.sample()
            ob, reward, done, _ = env.step(action)
            observations.append(ob)
            if done:
                break
    return observations


# Module-level environment: created once per process, so each SCOOP worker
# reuses its own instance across tasks.
env = gym.make("BipedalWalker-v3")

if __name__ == "__main__":
    maxit = 20
    results1 = futures.map(do, range(2, maxit))
    results2 = futures.map(do, range(2, maxit))
    for a, b in zip(results1, results2):
        if np.array_equiv(a, b):
            print("equal, yay")
        else:
            print("not equal :(")
Expected output: equal, yay on every line
Actual output: not equal :( on several lines
Full output:
/home/chef/.venv/neuro/bin/python -m scoop /home/chef/dev/projekte/NeuroEvolution-CTRNN_new/random_test.py
[2020-05-18 18:05:03,578] launcher INFO SCOOP 0.7 1.1 on linux using Python 3.8.2 (default, Apr 27 2020, 15:53:34) [GCC 9.3.0], API: 1013
[2020-05-18 18:05:03,578] launcher INFO Deploying 4 worker(s) over 1 host(s).
[2020-05-18 18:05:03,578] launcher INFO Worker distribution:
[2020-05-18 18:05:03,578] launcher INFO 127.0.0.1: 3 + origin
/home/chef/.venv/neuro/lib/python3.8/site-packages/gym/logger.py:30: UserWarning: WARN: Box bound precision lowered by casting to float32
warnings.warn(colorize('%s: %s'%('WARN', msg % args), 'yellow'))
/home/chef/.venv/neuro/lib/python3.8/site-packages/gym/logger.py:30: UserWarning: WARN: Box bound precision lowered by casting to float32
warnings.warn(colorize('%s: %s'%('WARN', msg % args), 'yellow'))
/home/chef/.venv/neuro/lib/python3.8/site-packages/gym/logger.py:30: UserWarning: WARN: Box bound precision lowered by casting to float32
warnings.warn(colorize('%s: %s'%('WARN', msg % args), 'yellow'))
/home/chef/.venv/neuro/lib/python3.8/site-packages/gym/logger.py:30: UserWarning: WARN: Box bound precision lowered by casting to float32
warnings.warn(colorize('%s: %s'%('WARN', msg % args), 'yellow'))
equal, yay
not equal :(
not equal :(
not equal :(
not equal :(
not equal :(
equal, yay
not equal :(
equal, yay
equal, yay
equal, yay
equal, yay
equal, yay
not equal :(
equal, yay
equal, yay
equal, yay
not equal :(
[2020-05-18 18:05:08,554] launcher (127.0.0.1:37729) INFO Root process is done.
[2020-05-18 18:05:08,554] launcher (127.0.0.1:37729) INFO Finished cleaning spawned subprocesses.
Process finished with exit code 0
When I run this example without SCOOP, I get almost perfect results:
/home/chef/.venv/neuro/bin/python /home/chef/dev/projekte/NeuroEvolution-CTRNN_new/random_test.py
/home/chef/.venv/neuro/lib/python3.8/site-packages/gym/logger.py:30: UserWarning: WARN: Box bound precision lowered by casting to float32
warnings.warn(colorize('%s: %s'%('WARN', msg % args), 'yellow'))
/home/chef/.venv/neuro/lib/python3.8/site-packages/scoop/fallbacks.py:38: RuntimeWarning: SCOOP was not started properly.
Be sure to start your program with the '-m scoop' parameter. You can find further information in the documentation.
Your map call has been replaced by the builtin serial Python map().
warnings.warn(
not equal :(
equal, yay
equal, yay
equal, yay
equal, yay
equal, yay
equal, yay
equal, yay
equal, yay
equal, yay
equal, yay
equal, yay
equal, yay
equal, yay
equal, yay
equal, yay
equal, yay
equal, yay
Process finished with exit code 0
I was able to "solve" it by moving the creation of the gym environment into the do function.
The complete corrected code looks like this:
import random
import numpy as np
from scoop import futures
import gym


def do(it):
    # Create a fresh environment for every task instead of sharing one per process.
    env = gym.make("BipedalWalker-v3")
    random.seed(it)
    np.random.seed(it)
    env.seed(it)
    env.action_space.seed(it)
    env.reset()
    observations = []
    for i in range(3):
        while True:
            action = env.action_space.sample()
            ob, reward, done, _ = env.step(action)
            observations.append(ob)
            if done:
                break
    return observations


if __name__ == "__main__":
    maxit = 20
    results1 = futures.map(do, range(2, maxit))
    results2 = futures.map(do, range(2, maxit))
    for a, b in zip(results1, results2):
        if np.array_equiv(a, b):
            print("equal, yay")
        else:
            print("not equal :(")
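Why this change helps is not established above. One possible explanation, which is an assumption and has not been checked against Gym or Box2D internals, is that a module-level environment lives on in each process and keeps some internal state that seed() and reset() do not fully restore, so a result depends on what the environment did before. The toy class below mimics that effect. ToyEnv, run_shared, and run_fresh are hypothetical names for illustration only, not Gym APIs, and the sketch is not meant to reproduce the exact pattern in the logs.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for an environment; not part of Gym."""

    def __init__(self):
        self.rng = random.Random()
        self.leftover = 0.0  # hidden state that seed() and reset() never clear

    def seed(self, seed):
        self.rng.seed(seed)

    def reset(self):
        pass  # deliberately incomplete: self.leftover survives a reset

    def step(self):
        self.leftover += self.rng.random()
        return self.leftover


def run_shared(env, seed, steps=3):
    """Reuse an existing environment, as with a module-level env."""
    env.seed(seed)
    env.reset()
    return [env.step() for _ in range(steps)]


def run_fresh(seed, steps=3):
    """Build a new environment per call, as in the corrected script."""
    return run_shared(ToyEnv(), seed, steps)


shared = ToyEnv()
shared_first = run_shared(shared, 2)
shared_second = run_shared(shared, 2)
print(shared_first == shared_second)  # False: hidden state leaked between calls
print(run_fresh(2) == run_fresh(2))   # True: nothing carries over
```

In this toy model, reseeding alone is not enough for the shared instance, while a fresh instance per call is reproducible. Whether the real environment behaves this way for the same reason remains an open assumption.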