Activating a Conda environment during Ray setup



I am trying to start a local Ray cluster, but the initialization and setup commands raise errors whose meaning I am unsure of.

After each command runs, the following messages are displayed (the full log is shown below):

bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell

They don't seem to prevent some commands from executing successfully, but I am unable to activate a conda environment on each node using:

# List of shell commands to run to set up each nodes.
setup_commands:
    - conda activate pytorch-dev

Any help or explanation would be appreciated.

My cluster configuration file (cluster_config_local.yaml) contains:

# An unique identifier for the head node and workers of this cluster.
cluster_name: default
## NOTE: Typically for local clusters, min_workers == initial_workers == max_workers.
# The minimum number of workers nodes to launch in addition to the head
# node. This number should be >= 0.
# Typically, min_workers == initial_workers == max_workers.
min_workers: 12
# The initial number of worker nodes to launch in addition to the head node.
# Typically, min_workers == initial_workers == max_workers.
initial_workers: 12
# The maximum number of workers nodes to launch in addition to the head node.
# This takes precedence over min_workers.
# Typically, min_workers == initial_workers == max_workers.
max_workers: 12
# Autoscaling parameters.
# Ignore this if min_workers == initial_workers == max_workers.
autoscaling_mode: default
target_utilization_fraction: 0.8
idle_timeout_minutes: 5
# This executes all commands on all nodes in the docker container,
# and opens all the necessary ports to support the Ray cluster.
# Empty string means disabled. Assumes Docker is installed.
docker:
    image: "" # e.g., tensorflow/tensorflow:1.5.0-py3
    container_name: "" # e.g. ray_docker
    run_options: []  # Extra options to pass into "docker run"
# Local specific configuration.
provider:
    type: local
    head_ip: cs19090bs #Lab 3, machine 311
    worker_ips: [
        cs19091bs, cs19093bs, cs19094bs, cs19095bs, cs19096bs,
        cs19103bs, cs19102bs, cs19101bs, cs19100bs, cs19099bs, cs19098bs, cs19097bs
    ]
# How Ray will authenticate with newly launched nodes.
auth:
    ssh_user: user
    ssh_private_key: ~/.ssh/id_rsa
# Leave this empty.
head_node: {}
# Leave this empty.
worker_nodes: {}
# Files or directories to copy to the head and worker nodes. The format is a
# dictionary from REMOTE_PATH: LOCAL_PATH, e.g.
file_mounts: {
#    "/path1/on/remote/machine": "/path1/on/local/machine",
#    "/path2/on/remote/machine": "/path2/on/local/machine",
}
# List of commands that will be run before `setup_commands`. If docker is
# enabled, these commands will run outside the container and before docker
# is setup.
initialization_commands: []
# List of shell commands to run to set up each nodes.
setup_commands:
    - conda activate pytorch-dev
# Custom commands that will be run on the head node after common setup.
head_setup_commands: []
# Custom commands that will be run on worker nodes after common setup.
worker_setup_commands: []
# Command to start ray on the head node. You don't need to change this.
head_start_ray_commands:
    - ray stop
    - ulimit -c unlimited && ray start --head --redis-port=6379 --autoscaling-config=~/ray_bootstrap_config.yaml
# Command to start ray on worker nodes. You don't need to change this.
worker_start_ray_commands:
    - ray stop
    - ray start --redis-address=$RAY_HEAD_IP:6379

The full log displayed when I run ray up cluster_config_local.yaml is:

2019-11-11 10:18:06,930 INFO node_provider.py:41 -- ClusterState: Loaded cluster state: ['cs19091bs', 'cs19093bs', 'cs19094bs', 'cs19095bs', 'cs19096bs', 'cs19090bs', 'cs19103bs', 'cs19102bs', 'cs19101bs', 'cs19100bs', 'cs19099bs', 'cs19098bs', 'cs19097bs']
This will create a new cluster [y/N]: y
2019-11-11 10:18:08,413 INFO commands.py:201 -- get_or_create_head_node: Launching new head node...
2019-11-11 10:18:08,414 INFO node_provider.py:85 -- ClusterState: Writing cluster state: ['cs19091bs', 'cs19093bs', 'cs19094bs', 'cs19095bs', 'cs19096bs', 'cs19090bs', 'cs19103bs', 'cs19102bs', 'cs19101bs', 'cs19100bs', 'cs19099bs', 'cs19098bs', 'cs19097bs']
2019-11-11 10:18:08,416 INFO commands.py:214 -- get_or_create_head_node: Updating files on head node...
2019-11-11 10:18:08,417 INFO updater.py:356 -- NodeUpdater: cs19090bs: Updating to 345f31e4c980153f1c40ae2c0be26b703d4bbfde
2019-11-11 10:18:08,419 INFO node_provider.py:85 -- ClusterState: Writing cluster state: ['cs19091bs', 'cs19093bs', 'cs19094bs', 'cs19095bs', 'cs19096bs', 'cs19090bs', 'cs19103bs', 'cs19102bs', 'cs19101bs', 'cs19100bs', 'cs19099bs', 'cs19098bs', 'cs19097bs']
2019-11-11 10:18:08,419 INFO updater.py:398 -- NodeUpdater: cs19090bs: Waiting for remote shell...
2019-11-11 10:18:08,420 INFO updater.py:210 -- NodeUpdater: cs19090bs: Waiting for IP...
2019-11-11 10:18:08,429 INFO log_timer.py:21 -- NodeUpdater: cs19090bs: Got IP [LogTimer=9ms]
2019-11-11 10:18:08,442 INFO updater.py:262 -- NodeUpdater: cs19090bs: Running uptime on 132.181.15.173...
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
10:18:10 up 4 days, 22:41,  1 user,  load average: 1.14, 0.56, 0.38
2019-11-11 10:18:10,178 INFO log_timer.py:21 -- NodeUpdater: cs19090bs: Got remote shell [LogTimer=1759ms]
2019-11-11 10:18:10,181 INFO node_provider.py:85 -- ClusterState: Writing cluster state: ['cs19091bs', 'cs19093bs', 'cs19094bs', 'cs19095bs', 'cs19096bs', 'cs19090bs', 'cs19103bs', 'cs19102bs', 'cs19101bs', 'cs19100bs', 'cs19099bs', 'cs19098bs', 'cs19097bs']
2019-11-11 10:18:10,182 INFO updater.py:262 -- NodeUpdater: cs19090bs: Running mkdir -p ~ on 132.181.15.173...
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2019-11-11 10:18:11,640 INFO updater.py:460 -- NodeUpdater: cs19090bs: Syncing /tmp/ray-bootstrap-aomvoo_d to ~/ray_bootstrap_config.yaml...
sending incremental file list
ray-bootstrap-aomvoo_d
sent 120 bytes  received 47 bytes  111.33 bytes/sec
total size is 1,063  speedup is 6.37
2019-11-11 10:18:12,147 INFO log_timer.py:21 -- NodeUpdater: cs19090bs: Synced /tmp/ray-bootstrap-aomvoo_d to ~/ray_bootstrap_config.yaml [LogTimer=1964ms]
2019-11-11 10:18:12,147 INFO updater.py:262 -- NodeUpdater: cs19090bs: Running mkdir -p ~ on 132.181.15.173...
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2019-11-11 10:18:13,610 INFO updater.py:460 -- NodeUpdater: cs19090bs: Syncing /home/cosc/student/atu31/.ssh/id_rsa to ~/ray_bootstrap_key.pem...
sending incremental file list
sent 60 bytes  received 12 bytes  48.00 bytes/sec
total size is 3,243  speedup is 45.04
2019-11-11 10:18:14,131 INFO log_timer.py:21 -- NodeUpdater: cs19090bs: Synced /home/cosc/student/atu31/.ssh/id_rsa to ~/ray_bootstrap_key.pem [LogTimer=1984ms]
2019-11-11 10:18:14,133 INFO node_provider.py:85 -- ClusterState: Writing cluster state: ['cs19091bs', 'cs19093bs', 'cs19094bs', 'cs19095bs', 'cs19096bs', 'cs19090bs', 'cs19103bs', 'cs19102bs', 'cs19101bs', 'cs19100bs', 'cs19099bs', 'cs19098bs', 'cs19097bs']
2019-11-11 10:18:14,134 INFO log_timer.py:21 -- NodeUpdater: cs19090bs: Initialization commands completed [LogTimer=0ms]
2019-11-11 10:18:14,134 INFO updater.py:262 -- NodeUpdater: cs19090bs: Running conda activate pytorch-dev on 132.181.15.173...
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2019-11-11 10:18:15,740 INFO log_timer.py:21 -- NodeUpdater: cs19090bs: Setup commands completed [LogTimer=1605ms]
2019-11-11 10:18:15,740 INFO updater.py:262 -- NodeUpdater: cs19090bs: Running ray stop on 132.181.15.173...
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2019-11-11 10:18:17,809 INFO updater.py:262 -- NodeUpdater: cs19090bs: Running ulimit -c unlimited && ray start --head --redis-port=6379 --autoscaling-config=~/ray_bootstrap_config.yaml on 132.181.15.173...
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2019-11-11 10:18:19,923 INFO scripts.py:303 -- Using IP address 132.181.15.173 for this node.
2019-11-11 10:18:19,924 INFO resource_spec.py:205 -- Starting Ray with 7.62 GiB memory available for workers and up to 3.81 GiB for objects. You can adjust these settings with ray.init(memory=<bytes>, object_store_memory=<bytes>).
2019-11-11 10:18:20,169 INFO scripts.py:333 -- 
Started Ray on this node. You can add additional nodes to the cluster by calling
ray start --redis-address 132.181.15.173:6379
from the node you wish to add. You can connect a driver to the cluster from Python by running
import ray
ray.init(redis_address="132.181.15.173:6379")
If you have trouble connecting from a different machine, check that your firewall is configured properly. If you wish to terminate the processes that have been started, run
ray stop
2019-11-11 10:18:20,221 INFO log_timer.py:21 -- NodeUpdater: cs19090bs: Ray start commands completed [LogTimer=4480ms]
2019-11-11 10:18:20,222 INFO log_timer.py:21 -- NodeUpdater: cs19090bs: Applied config 345f31e4c980153f1c40ae2c0be26b703d4bbfde [LogTimer=11804ms]
2019-11-11 10:18:20,224 INFO node_provider.py:85 -- ClusterState: Writing cluster state: ['cs19091bs', 'cs19093bs', 'cs19094bs', 'cs19095bs', 'cs19096bs', 'cs19090bs', 'cs19103bs', 'cs19102bs', 'cs19101bs', 'cs19100bs', 'cs19099bs', 'cs19098bs', 'cs19097bs']
2019-11-11 10:18:20,226 INFO commands.py:281 -- get_or_create_head_node: Head node up-to-date, IP address is: 132.181.15.173
To monitor auto-scaling activity, you can run:
ray exec cluster/cluster_config_local.yaml 'tail -n 100 -f /tmp/ray/session_*/logs/monitor*'
To open a console on the cluster:
ray attach cluster_config_local.yaml
To get a remote shell to the cluster manually, run:
ssh -i ~/.ssh/id_rsa user@132.181.15.173
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell

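This error message is harmless (and should be silenced in Ray). See "How to tell bash not to issue warnings "cannot set terminal process group" and "no job control in this shell" when it can't assert job control?".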
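
As for the conda environment itself: the log shows a fresh bash starting for every command, which suggests that each setup or start command runs in its own shell. If that is the case, an environment activated by one command would not carry over to the next. The sketch below illustrates only that general shell behavior. It does not call Ray or conda, and DEMO_ENV is a hypothetical variable standing in for the activated environment.

```shell
#!/usr/bin/env bash
# Sketch: state set in one shell invocation is not visible in the next one.
# DEMO_ENV is a hypothetical stand-in for whatever `conda activate` would change.
unset DEMO_ENV

# Command 1: "activate" by exporting a variable; this shell then exits.
bash -c 'export DEMO_ENV=pytorch-dev'

# Command 2: a separately launched shell does not see the variable.
separate=$(bash -c 'echo "${DEMO_ENV:-unset}"')

# Chaining both steps inside one command keeps them in the same shell.
chained=$(bash -c 'export DEMO_ENV=pytorch-dev && echo "${DEMO_ENV:-unset}"')

echo "separate commands: $separate"
echo "chained command:   $chained"
```

Under that assumption, one option may be to chain the activation with the command that needs it in the same config entry (conda activate pytorch-dev && followed by the ray start command), provided conda is initialized in the shell that Ray launches.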
