Ansible: start processes and wait until the telnet check succeeds



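I trigger several Tomcat startup scripts and then need to check, as quickly as possible, that every process is listening on its specific port across multiple hosts.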

For a test case, I wrote three scripts that run on a single host and listen on ports 4443, 4445 and 4447 respectively, as shown below.

/tmp/startapp1.sh

while test 1 # infinite loop
sleep 10     # part of the loop condition: wait before (re)starting the listener
do
    nc -l localhost 4443 > /tmp/app1.log
done

/tmp/startapp2.sh

while test 1 # infinite loop
sleep 30     # part of the loop condition: wait before (re)starting the listener
do
    nc -l localhost 4445 > /tmp/app2.log
done

/tmp/startapp3.sh

while test 1 # infinite loop
sleep 20     # part of the loop condition: wait before (re)starting the listener
do
    nc -l localhost 4447 > /tmp/app3.log
done

Below is the code that triggers the scripts and checks whether telnet succeeds:

main.yml

- include_tasks: "internal.yml"
  loop:
    - /tmp/startapp1.sh 4443
    - /tmp/startapp2.sh 4445
    - /tmp/startapp3.sh 4447

internal.yml

- shell: "{{ item.split()[0] }}"
  async: 600
  poll: 0
- name: DEBUG CHECK TELNET
  shell: "telnet {{ item.split()[1] }}"
  delegate_to: localhost
  register: telnetcheck
  until: telnetcheck.rc == 0
  async: 600
  poll: 0
  delay: 6
  retries: 10
- name: Result of TELNET
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: _jobs
  until: _jobs.finished
  delay: 6
  retries: 10
  with_items: "{{ telnetcheck.results }}"

To run: ansible-playbook main.yml

Requirement: all three scripts above should be started, with their telnet checks passing, within about 30 seconds.

So the basic check needed here is the telnet until: telnetcheck.rc == 0, but because of async, the result of the telnet shell task has no rc entry, so I get the following error:

"msg": "The conditional check 'telnetcheck.rc == 0' failed. The error was: error while evaluating conditional (telnetcheck.rc == 0): 'dict object' has no attribute 'rc'"

In the code above, where and how do I check that telnet succeeded, i.e. telnetcheck.rc == 0, and make sure the requirement is met?

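Currently I am not aware of a solution that starts a shell script and waits for its status within a single task. You could change the shell script to behave as needed, so that it performs a self-check and reports the outcome through its exit code. Or you could implement two or more tasks, where one executes the shell script and the others check certain conditions later.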
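As a rough illustration of the self-check idea, here is a minimal sketch of a retry helper a start script could call before exiting. The names retry_until and port_open are hypothetical, and the probe assumes an nc build that supports the -z (scan only) flag; the retries and delay arguments are only meant to mirror the Ansible keywords of the same name.

```shell
#!/bin/sh
# retry_until RETRIES DELAY COMMAND [ARGS...]
# Runs COMMAND up to RETRIES times, sleeping DELAY seconds between attempts.
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt failed.
retry_until() {
    retries=$1
    delay=$2
    shift 2
    attempt=1
    while ! "$@"; do
        if [ "$attempt" -ge "$retries" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 0
}

# Hypothetical probe: succeeds when something accepts connections on HOST:PORT.
# Assumes an nc variant that implements -z.
port_open() {
    nc -z "$1" "$2" >/dev/null 2>&1
}

# Tiny demo with a command that always succeeds.
retry_until 3 0 true && echo "probe succeeded"
```

A start script could then end with something like retry_until 10 6 port_open localhost 4443, so that its exit code tells the calling task whether the listener came up.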

Regarding your requirement

wait until telnet localhost 8076 is LISTENING (successful)

you may have a look at the wait_for module.

---
- hosts: localhost
  become: false
  gather_facts: false
  tasks:
    - name: "Test connection to local port"
      wait_for:
        host: localhost
        port: 8076
        delay: 0
        timeout: 3
        active_connection_states: SYN_RECV
      check_mode: false # because remote module (wait_for) does not support it
      register: result
    - name: Show result
      debug:
        msg: "{{ result }}"

Further Q&A

  • How to use the Ansible module wait_for together with loop
  • Firewall functionality test

Another way to test, from the control node, whether there is a listener on localhost of the remote node:

---
- hosts: test.example.com
  become: true
  gather_facts: false
  vars:
    PORT: "8076"
  tasks:
    - name: "Check for LISTENER on remote localhost"
      shell:
        cmd: "lsof -Pi TCP:{{ PORT }}"
      changed_when: false
      check_mode: false
      register: result
      failed_when: result.rc != 0 and result.rc != 1
    - name: Report missing LISTENER
      debug:
        msg: "No LISTENER on PORT {{ PORT }}"
      when: result.rc == 1
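
The exit-code handling in the playbook above can also be sketched in plain shell, which may help when trying the check by hand on the remote node. The function name classify_lsof_rc is made up for this illustration; it only encodes the same three-way split as failed_when and when.

```shell
#!/bin/sh
# classify_lsof_rc RC
# Maps an lsof exit code to the three cases the playbook distinguishes:
#   0     -> a matching listener was found
#   1     -> nothing matched (reported, but not treated as a failure)
#   other -> treated as a real failure
classify_lsof_rc() {
    case "$1" in
        0) echo "listener present" ;;
        1) echo "no listener" ;;
        *) echo "unexpected failure"; return 1 ;;
    esac
}

classify_lsof_rc 0   # prints "listener present"
classify_lsof_rc 1   # prints "no listener"
```

On a host with lsof installed, it could be used as: lsof -Pi TCP:8076 >/dev/null 2>&1; classify_lsof_rc $?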

Using an asynchronous action together with until in the same task makes almost no sense.

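As for your requirement to get the answer in the fastest time possible, you will have to reconsider it. In your three-port case, if you want all ports to be open before moving on, the check will always be as slow as the slowest port to open, no matter what. Even if the first port we probe happens to be the slowest, the other two will then be probed instantly, so trying to optimise this with async is, in my opinion, an unnecessary optimisation.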
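To make that argument concrete, here is a small back-of-the-envelope sketch. It assumes each port becomes ready a fixed number of seconds after start (10, 30 and 20, taken from the test scripts in the question) and ignores the polling interval; sequential_finish is a hypothetical helper name.

```shell
#!/bin/sh
# sequential_finish READY_TIME...
# Given the times (in seconds) at which each port becomes ready, prints the
# time at which probing them one after another, in that order, would finish.
sequential_finish() {
    now=0
    for ready in "$@"; do
        # A probe started at time "now" returns once its port is open,
        # so the clock only advances if this port is not ready yet.
        if [ "$ready" -gt "$now" ]; then
            now=$ready
        fi
    done
    echo "$now"
}

sequential_finish 10 30 20   # prints 30
sequential_finish 30 10 20   # prints 30
```

Whatever the order, the sequential probes finish when the slowest port opens, which is also the best that parallel probing could achieve.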

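Either you use until, in which case each port probe blocks until the port answers, or you run the probes asynchronously, in which case async_status will catch the return as it should, provided you wrap telnet in a shell until loop.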

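In your until loop, the problem is that the return code is not set until the command has actually returned, so you just have to check whether the rc key of the dictionary is defined.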

Note that for all the examples below, I am opening the ports manually with nc -l -p <port>, which is why they open gradually.


With until:

- shell: "telnet localhost {{ item.split()[1] }}"
  delegate_to: localhost
  register: telnetcheck
  until:
    - telnetcheck.rc is defined
    - telnetcheck.rc == 0
  delay: 6
  retries: 10

This yields:

TASK [shell] *****************************************************************
FAILED - RETRYING: [localhost]: shell (10 retries left).
changed: [localhost] => (item=/tmp/startapp1.sh 4443)
FAILED - RETRYING: [localhost]: shell (10 retries left).
changed: [localhost] => (item=/tmp/startapp2.sh 4445)
FAILED - RETRYING: [localhost]: shell (10 retries left).
changed: [localhost] => (item=/tmp/startapp3.sh 4447)

With async:

- shell: "until telnet 127.0.0.1 {{ item.split()[1] }}; do sleep 2; done"
  delegate_to: localhost
  register: telnetcheck
  async: 600
  poll: 0
# telnetcheck.results is only present when the task above runs in a loop over the script/port items
- async_status:
    jid: "{{ item.ansible_job_id }}"
  register: _jobs
  until: _jobs.finished
  delay: 6
  retries: 10
  loop: "{{ telnetcheck.results }}"
  loop_control:
    label: "{{ item.item }}"

This yields:

TASK [shell] *****************************************************************
changed: [localhost] => (item=/tmp/startapp1.sh 4443)
changed: [localhost] => (item=/tmp/startapp2.sh 4445)
changed: [localhost] => (item=/tmp/startapp3.sh 4447)
TASK [async_status] **********************************************************
FAILED - RETRYING: [localhost]: async_status (10 retries left).
changed: [localhost] => (item=/tmp/startapp1.sh 4443)
FAILED - RETRYING: [localhost]: async_status (10 retries left).
changed: [localhost] => (item=/tmp/startapp2.sh 4445)
FAILED - RETRYING: [localhost]: async_status (10 retries left).
changed: [localhost] => (item=/tmp/startapp3.sh 4447)

That said, you should seriously consider @U880D's answer, since it is the more native approach:

- wait_for:
    host: localhost
    port: "{{ item.split()[1] }}"
    delay: 6
    timeout: 60

This yields:

TASK [wait_for] **************************************************************
ok: [localhost] => (item=/tmp/startapp1.sh 4443)
ok: [localhost] => (item=/tmp/startapp2.sh 4445)
ok: [localhost] => (item=/tmp/startapp3.sh 4447)
