Mesosphere DC/OS installation: post-flight failure



Hi all,

My installation is at the RUNNING POST-FLIGHT stage. The agents completed, but the master reports errors.

I checked by SSHing into the master:

$ journalctl -u dcos-exhibitor -b

-- Logs begin at Tue 2016-04-19 15:38:32 CST, end at Wed 2016-04-20 14:38:16 CST. --
Apr 20 12:40:31 worker02 systemd[1]: Started Exhibitor Zookeeper Supervisor.
Apr 20 12:40:31 worker02 systemd[1]: Starting Exhibitor Zookeeper Supervisor...
Apr 20 12:42:39 worker02 unshare[32443]: curl: (7) Failed to connect to 169.254.169.254 port 80: Connection timed out
Apr 20 12:42:39 worker02 unshare[32443]: inet_aton exited with illegal IP address string passed to inet_aton.  is not a valid IPv4 address
Apr 20 12:42:39 worker02 systemd[1]: dcos-exhibitor.service: main process exited, code=exited, status=1/FAILURE
Apr 20 12:42:39 worker02 systemd[1]: Unit dcos-exhibitor.service entered failed state.
Apr 20 12:42:39 worker02 systemd[1]: dcos-exhibitor.service failed.
Apr 20 12:42:44 worker02 systemd[1]: dcos-exhibitor.service holdoff time over, scheduling restart.

$ journalctl -u dcos-mesos-master -b

-- Logs begin at Tue 2016-04-19 15:38:32 CST, end at Wed 2016-04-20 14:46:38 CST. --
Apr 20 12:40:31 worker02 systemd[1]: Starting Mesos Master...
Apr 20 12:40:31 worker02 exhibitor_wait.py[32438]: Could not get exhibitor status: http://127.0.0.1:8181/exhibitor/v1/cluster/status
Apr 20 12:40:31 worker02 systemd[1]: dcos-mesos-master.service: control process exited, code=exited status=1
Apr 20 12:40:31 worker02 systemd[1]: Failed to start Mesos Master.
Apr 20 12:40:31 worker02 systemd[1]: Unit dcos-mesos-master.service entered failed state.
Apr 20 12:40:31 worker02 systemd[1]: dcos-mesos-master.service failed.
Apr 20 12:40:46 worker02 systemd[1]: dcos-mesos-master.service holdoff time over, scheduling restart.

$ journalctl -u dcos-mesos-dns -b

Apr 20 12:41:00 worker02 mesos-dns[32467]: 2016/04/20 12:41:00 Failed to connect to 127.0.0.1:2181: dial tcp 127.0.0.1:2181: getsockopt: connection refused
Apr 20 12:41:01 worker02 mesos-dns[32467]: ERROR: 2016/04/20 12:41:01 main.go:80: master detection timed out after 30s
Apr 20 12:41:01 worker02 systemd[1]: dcos-mesos-dns.service: main process exited, code=exited, status=1/FAILURE
Apr 20 12:41:01 worker02 systemd[1]: Unit dcos-mesos-dns.service entered failed state.
Apr 20 12:41:01 worker02 systemd[1]: dcos-mesos-dns.service failed.
Apr 20 12:41:06 worker02 systemd[1]: dcos-mesos-dns.service holdoff time over, scheduling restart.

$ journalctl -u dcos-marathon -b

-- Logs begin at Tue 2016-04-19 15:38:32 CST, end at Wed 2016-04-20 14:50:30 CST. --
Apr 20 12:40:31 worker02 systemd[1]: Starting Marathon...
Apr 20 12:40:31 worker02 exhibitor_wait.py[32476]: Could not get exhibitor status: http://127.0.0.1:8181/exhibitor/v1/cluster/status
Apr 20 12:40:32 worker02 systemd[1]: dcos-marathon.service: control process exited, code=exited status=1
Apr 20 12:40:32 worker02 systemd[1]: Failed to start Marathon.
Apr 20 12:40:32 worker02 systemd[1]: Unit dcos-marathon.service entered failed state.
Apr 20 12:40:32 worker02 systemd[1]: dcos-marathon.service failed.
Apr 20 12:40:47 worker02 systemd[1]: dcos-marathon.service holdoff time over, scheduling restart.

$ journalctl -u dcos-nginx -b

-- Logs begin at Tue 2016-04-19 15:38:32 CST, end at Wed 2016-04-20 14:51:49 CST. --
Apr 20 12:40:31 worker02 systemd[1]: Starting A high performance web server and a reverse proxy server...
Apr 20 12:40:31 worker02 curl[32468]: curl: (7) Failed to connect to localhost port 8101: Connection refused
Apr 20 12:40:31 worker02 systemd[1]: dcos-nginx.service: control process exited, code=exited status=7
Apr 20 12:40:31 worker02 systemd[1]: Failed to start A high performance web server and a reverse proxy server.
Apr 20 12:40:31 worker02 systemd[1]: Unit dcos-nginx.service entered failed state.
Apr 20 12:40:31 worker02 systemd[1]: dcos-nginx.service failed.
Apr 20 12:40:36 worker02 systemd[1]: dcos-nginx.service holdoff time over, scheduling restart.

$ journalctl -u dcos-gen-resolvconf -b

-- Logs begin at Tue 2016-04-19 15:38:32 CST, end at Wed 2016-04-20 14:53:15 CST. --
Apr 20 12:40:31 worker02 systemd[1]: Started Update systemd-resolved for mesos-dns.
Apr 20 12:40:31 worker02 systemd[1]: Starting Update systemd-resolved for mesos-dns...
Apr 20 12:40:36 worker02 gen_resolvconf.py[32439]: Skipping DNS server 15.242.100.56: no response
Apr 20 12:41:32 worker02 systemd[1]: Started Update systemd-resolved for mesos-dns.
Apr 20 12:42:44 worker02 gen_resolvconf.py[32439]: curl: (7) Failed to connect to 169.254.169.254 port 80: Connection timed out
Apr 20 12:42:44 worker02 gen_resolvconf.py[32439]: inet_aton exited with illegal IP address string passed to inet_aton.  is not a valid IPv4 address
Apr 20 12:42:44 worker02 systemd[1]: dcos-gen-resolvconf.service: main process exited, code=exited, status=1/FAILURE
Apr 20 12:42:44 worker02 systemd[1]: Unit dcos-gen-resolvconf.service entered failed state.
Apr 20 12:42:44 worker02 systemd[1]: dcos-gen-resolvconf.service failed.

And after SSHing into an agent:

$ journalctl -u dcos-mesos-slave -b

-- Logs begin at Tue 2016-04-19 15:32:51 CST, end at Wed 2016-04-20 14:55:23 CST. --
Apr 20 13:06:50 worker03 systemd[1]: Starting Mesos Slave...
Apr 20 13:06:51 worker03 ping[14893]: ping: unknown host leader.mesos
Apr 20 13:06:51 worker03 systemd[1]: dcos-mesos-slave.service: control process exited, code=exited status=2
Apr 20 13:06:51 worker03 systemd[1]: Failed to start Mesos Slave.
Apr 20 13:06:51 worker03 systemd[1]: Unit dcos-mesos-slave.service entered failed state.
Apr 20 13:06:51 worker03 systemd[1]: dcos-mesos-slave.service failed.
Apr 20 13:06:56 worker03 systemd[1]: dcos-mesos-slave.service holdoff time over, scheduling restart.

I don't know what is going wrong. Any ideas? Thanks a lot!

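The problem is the ip-detect script you chose. You are using the AWS one, which, per the advanced installation guide, curls http://169.254.169.254/latest/meta-data/local-ipv4. That metadata address is only reachable on AWS, so on your hosts curl times out and inet_aton is handed an empty string, which is exactly what the exhibitor and gen-resolvconf logs show.

I installed DC/OS in an on-premises CentOS environment with the following custom ip-detect script: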

#!/usr/bin/env bash
set -o nounset -o errexit
export PATH=/usr/sbin:/usr/bin:$PATH
echo $(ip addr show virbr0 | grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' | head -1)

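Be careful with virbr0. I used it only because my hosts have no eth0 interface; substitute whichever interface carries your cluster traffic.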
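
Before rerunning the installer, it may be worth checking on each node that the script really prints a single IPv4 address, since an empty result is what triggered the inet_aton error in the logs above. The following is a minimal sketch of such a check. The function name is_ipv4 is hypothetical and not part of DC/OS, and the script path in the usage line is an assumption about where you keep your ip-detect script.

```shell
# Hypothetical helper (not part of DC/OS): returns 0 if the argument
# is a dotted-quad IPv4 address with every octet in 0..255.
is_ipv4() {
    addr=${1-}
    case "$addr" in
        ''|*[!0-9.]*) return 1 ;;   # empty, or contains a non-digit/non-dot
        .*|*.|*..*)   return 1 ;;   # an empty octet somewhere
        *.*.*.*.*)    return 1 ;;   # more than three dots
        *.*.*.*)      ;;            # exactly three dots: keep checking
        *)            return 1 ;;   # fewer than three dots
    esac
    old_ifs=$IFS
    IFS=.
    set -- $addr                    # split the address into its four octets
    IFS=$old_ifs
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}
```

With the function defined in your shell, a call such as `is_ipv4 "$(./genconf/ip-detect)" && echo "looks valid"` tells you whether the script output is usable on that host.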
